Abstract

We reformulate the transport equation which determines the size, shape and orientation 
of infinitesimal light beams in arbitrary spacetimes. The behaviour of such light beams 
near vertices and conjugate points is investigated, with special attention to the singular 
behaviour of the optical scalars. We then specialize the general transport equation to 
the case of an approximate metric of an inhomogeneous universe, which is a Friedmann 
metric 'on average' with superposed isolated weak matter inhomogeneities. In a series
of well-defined approximations, the equations of gravitational lens theory are derived.
Finally, we derive a relative optical focusing equation which describes the focusing of
light beams relative to the case that the beam is unaffected by matter inhomogeneities
in the universe, from which it follows immediately that no beam can be focused less than
one which is unaffected by matter clumps, before it propagates through its first conjugate
point.
1 Introduction



The propagation of light rays in curved spacetimes is described by the equation for 
null geodesics. Below, we consider congruences of light rays, so-called light beams (for
an exact definition, see Sect. 2) and study their propagation in arbitrary spacetimes. 
Infinitesimal light beams are described by Jacobi's differential equation for deviation 
vectors. In this paper, we study some properties of the solutions of this propagation 
equation. In particular, we provide a detailed study of the behaviour of light beams near 
vertices and conjugate points. The behaviour of the optical scalars (Sachs 1961) which 
may diverge near conjugate points is determined. We find the leading-order behaviour 
of the convergence, shear and twist of light beams and their relation to the optical tidal 
matrix which represents the source of beam deformation. 

We then specialize the propagation equation to the case that the metric can be 
described by that of a Friedmann universe, with superposed weak local inhomogeneities; 
this is the situation most relevant for the light propagation in the universe. Here, the 
optical tidal matrix can be split into a contribution due to the background universe and 
one due to the local inhomogeneities, which is described in the first post-Minkowskian 
approximation. The background universe is assumed to have the overall geometry of a 
smooth Friedmann universe, but is locally modified due to matter inhomogeneities. 

If the matter inhomogeneities along the light beam are well localized, i.e., if the spa- 
tial extent of the inhomogeneities is much smaller than the distance from the source to an 
observer, the contributions from the inhomogeneities can be described in the impulse ap- 
proximation, in which the contribution to the optical tidal matrix due to inhomogeneities 
is replaced by a sum of delta-distributions. We will then show that this approximation 
leads to the gravitational lens equations, which are usually used to describe the influence 
of weak matter inhomogeneities on light propagation in the universe (for a review on 
gravitational lens theory and its applications, see Schneider, Ehlers & Falco 1992, here- 
after SEF). Hence, the gravitational lens equations follow from the exact propagation 
equations for light beams with a series of well-defined approximations. 

The behaviour of the cross-sectional area of an infinitesimally small light beam is 
described by the optical focusing equation (Sachs 1961), which contains the trace of the 
optical tidal matrix and the shear of the light beam as source terms. We will show that a 
relative optical focusing equation can be obtained which describes the cross-sectional area 
of a beam relative to one which is unaffected by matter inhomogeneities. The uniquely- 
determined independent variable for this relative focusing equation is the $\chi$-function
introduced for other reasons in Sect. 4.6 of SEF. From this relative focusing equation it 
follows directly that no light beam can be less focused than one which is unaffected by 
matter inhomogeneities before the beam propagates through its first conjugate point. In 
the frame of gravitational lens theory, this fact has been proved earlier (Schneider 1984, 
Seitz & Schneider 1992, hereafter Paper I, 1994). 

2 Infinitesimal light beams 

In this section we review some consequences of the fact that, according to the geometrical
optics approximation to Maxwell's equations in an arbitrary spacetime $(M, g_{\alpha\beta})$, a
locally nearly plane electromagnetic wave, propagating without interaction with matter,
is associated with a hypersurface-orthogonal congruence of null geodesics representing
light rays. We denote the corresponding phase function by $S$ and the wave vector by
$k^\alpha = -g^{\alpha\beta} S_{,\beta}$; then $k^\alpha k_\alpha = 0$ and $\dot k^\alpha := k^\alpha{}_{;\beta} k^\beta = 0$. (For details concerning this
section see, e.g., SEF, Chapt. 3, and Wald 1984, Sects. 9.2 & 9.3; see also Blandford &
Narayan 1992.)

We fix attention on one "central" light ray $\gamma_0$ and denote by $Y^\alpha$ any deviation vector
field (Jacobi field) "connecting" $\gamma_0$ to one of its neighbours. Then, $k_\alpha Y^\alpha$ is constant on
$\gamma_0$. Deviation vectors differing by a (constant) multiple of $k^\alpha$ represent displacements to
the same nearby ray. Given the four-velocity $U^\alpha$ of an observer at an event $p$ on $\gamma_0$, one
can always arrange that $Y^\alpha$ is spatial for $U^\alpha$, i.e., $U_\alpha Y^\alpha = 0$.

Two events $p$, $q$ on $\gamma_0$ are said to be conjugate if there exists a not identically
vanishing Jacobi field which is zero at $p$ and $q$. For such a Jacobi field, $k_\alpha Y^\alpha = 0$.
A deviation vector satisfying the last equation (whether it vanishes somewhere or not)
connects rays contained in the same phase hypersurface $S = \mathrm{const}$.

Henceforth we consider exclusively 2-parameter families of rays contained in one
phase hypersurface, which we call beams. Their deviation vectors obey $k_\alpha Y^\alpha = 0$;
consequently the size, shape and orientation of an infinitesimal cross section of a beam are
independent of the 4-velocity of the observer who measures it.

Given the 4-velocity $U^\alpha$ of an observer at an event $p$ on $\gamma_0$, one can choose deviation
vectors to all neighbouring rays such that, besides $k_\alpha Y^\alpha = 0$, also $U_\alpha Y^\alpha = 0$. Such
vectors $Y^\alpha$ span a 2-dimensional, spacelike subspace of the tangent space $M_p$ of $p$ which
we call a screen adapted to $k^\alpha$, $U^\alpha$.

In studying conjugate pairs on a ray $\gamma_0$ it suffices to consider deviation vectors
belonging to a beam surrounding $\gamma_0$.

For gravitational lensing, the important beams are those which are contained in
either the future null cone $C^+$ of an event $s$ - flashes of light emitted from a source event
$s$ - or the past null cone $C^-$ of an observation event $o$. (In the second case, the rays
of a beam belong to different, usually mutually incoherent locally plane waves, emitted
from different source events. This does not matter for the applications considered in this
paper. It is often helpful to think of the rays as [classical models of] photons.) In the
remainder of this paper we are concerned with such beams only.

$C^-$ is generated by all null geodesic rays ending at $o$. The set of all events conjugate
to the vertex $o$ on those rays forms the caustic of $C^-$. $C^-$ has the shape of a (hyper-)cone
only between $o$ and the first sheet of the caustic; thereafter in general it bifurcates and
intersects itself. This is the (theoretical) reason for the phenomenon of multiple imaging
in gravitational lensing.

Consider an observer at the event $o$ with 4-velocity $U_o^\alpha$, $U_o^\alpha U_{o\alpha} = 1$, and the past
light cone $C^-$. Choose the affine parameter $\lambda$ of the rays ending at $o$ such that (i) $\lambda = 0$
at $o$, (ii) $\lambda$ increases to the past, (iii) at $o$, $\tilde k^\alpha U_{o\alpha} = -1$. Then, $\tilde k^\alpha = dx^\alpha/d\lambda$ is
past-directed, and for events on $C^-$ infinitely close to $o$, $d\lambda$ is the distance from $o$ measured
by the chosen observer. The "new" $\tilde k^\alpha$ is related to the wave vector introduced above
by $k^\alpha = -\frac{\omega_o}{c}\,\tilde k^\alpha$ if $\omega_o$ is the frequency associated with $k^\alpha$ at the observer; $\tilde k^\alpha$ is purely
kinematical, the same for all monochromatic waves which might be travelling in the
direction $-\tilde k^\alpha$. Let $\gamma_0$ be a ray, and let $U^\alpha$ on $\gamma_0$ be the result of parallelly propagating
$U_o^\alpha$. Choose, along $\gamma_0$, orthonormal bases $(E_1, E_2)$ on the screens adapted to $\tilde k^\alpha$, $U^\alpha$,
parallel on $\gamma_0$. The deviation vectors of the beam centered on $\gamma_0$ can then be written as
$Y^\alpha = -\xi_1 E_1^\alpha - \xi_2 E_2^\alpha - \xi_0 \tilde k^\alpha$; the screen components $\xi_i$ ($i = 1, 2$) then change according
to the deformation equation

$$ \dot\xi_i = \mathcal S_{ij}\,\xi_j , \qquad \mathcal S_{ij} = E_i^\alpha\,\tilde k_{\alpha;\beta}\,E_j^\beta , $$

where a dot denotes differentiation with respect to the affine parameter. In matrix 
notation we write 

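$$ \dot\xi = \mathcal S\,\xi . \tag{2.1a} $$

The optical deformation matrix $\mathcal S$ is composed of Sachs' optical scalars of the beam (Sachs
1961), i.e., its rate of expansion

$$ \theta(\lambda) := \frac{1}{2}\,\tilde k^\alpha{}_{;\alpha}(\lambda) $$

and its (complex) rate of shear,

$$ \sigma(\lambda) := \frac{1}{2}\,\tilde k_{\alpha;\beta}(\lambda)\,\epsilon^{*\alpha}(\lambda)\,\epsilon^{*\beta}(\lambda) , \qquad \epsilon^\alpha := E_1^\alpha + i E_2^\alpha , $$

according to

$$ \mathcal S(\lambda) = \begin{pmatrix} \theta(\lambda) - \mathrm{Re}\,\sigma(\lambda) & \mathrm{Im}\,\sigma(\lambda) \\ \mathrm{Im}\,\sigma(\lambda) & \theta(\lambda) + \mathrm{Re}\,\sigma(\lambda) \end{pmatrix} . \tag{2.1b} $$

Since $k_\alpha = -S_{,\alpha}$ is the gradient of the phase $S$, $\tilde k_{\alpha;\beta} = \tilde k_{\beta;\alpha}$, and therefore $\mathcal S$ is a
symmetric matrix. Differentiation of (2.1a) with respect to $\lambda$ gives

$$ \ddot\xi(\lambda) = \mathcal T(\lambda)\,\xi(\lambda) , \tag{2.2a} $$

where

$$ \dot{\mathcal S} + \mathcal S^2 = \mathcal T . \tag{2.2b} $$

Combining the last equation with Sachs' transport equations for $\theta$ and $\sigma$,

$$ \dot\theta + \theta^2 + |\sigma|^2 = \mathcal R , \tag{2.2c} $$

$$ \dot\sigma + 2\theta\sigma = \mathcal F , \tag{2.2d} $$

shows that the optical tidal matrix $\mathcal T$ is given by

$$ \mathcal T(\lambda) = \begin{pmatrix} \mathcal R(\lambda) - \mathrm{Re}\,\mathcal F(\lambda) & \mathrm{Im}\,\mathcal F(\lambda) \\ \mathrm{Im}\,\mathcal F(\lambda) & \mathcal R(\lambda) + \mathrm{Re}\,\mathcal F(\lambda) \end{pmatrix} , \tag{2.2e} $$

where

$$ \mathcal R(\lambda) := -\frac{1}{2}\,R_{\beta\gamma}(\lambda)\,\tilde k^\beta(\lambda)\,\tilde k^\gamma(\lambda) , \tag{2.2f} $$

$$ \mathcal F(\lambda) := -\frac{1}{2}\,C_{\alpha\beta\gamma\delta}(\lambda)\,\epsilon^{*\alpha}(\lambda)\,\tilde k^\beta(\lambda)\,\epsilon^{*\gamma}(\lambda)\,\tilde k^\delta(\lambda) . \tag{2.2g} $$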

Similar equations have been derived by Blandford et al. (1991) and Peebles (1993, Chapt.
14).¹ The optical tidal matrix is symmetric due to the symmetry $C_{\alpha\beta\gamma\delta} = C_{\delta\gamma\beta\alpha}$ of the
conformal curvature tensor. Equations (2.2a,e,f&g) exhibit how the Ricci and conformal
curvatures govern the evolution of infinitesimal light beams; they are equivalent to the
geodesic deviation equation (Jacobi equation) for screen vectors.²

¹ Note, however, that the component $\propto \tilde k^\alpha$ of $Y^\alpha$ cannot be made to vanish for all $\lambda$, contrary to
the claim in Peebles (consider equation (14.9) in his book, where $\xi^j$ corresponds to 'our' $Y^\alpha$).

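The linearity of the Jacobi equation (2.2a) implies that the solution $\xi(\lambda)$ is related
to its initial value $\dot\xi(0) =: \boldsymbol\theta$ by a $\lambda$-dependent linear transformation

$$ \xi(\lambda) = \mathcal D(\lambda)\,\boldsymbol\theta . \tag{2.3} $$

With the choice of $\lambda$ described above, $\boldsymbol\theta$ is the (vectorial) angle between $\gamma_0$ and a
neighbouring ray. Because of (2.2a), $\xi(0) = 0$ and $\dot\xi(0) = \boldsymbol\theta$, $\mathcal D(\lambda)$ is determined by

$$ \ddot{\mathcal D}(\lambda) = \mathcal T(\lambda)\,\mathcal D(\lambda) , \tag{2.4a} $$

$$ \mathcal D(0) = \mathcal O , \qquad \dot{\mathcal D}(0) = \mathcal I \tag{2.4b} $$

or, equivalently, by the linear integral equation

$$ \mathcal D(\lambda) = \lambda\,\mathcal I + \int_0^\lambda d\lambda'\,(\lambda - \lambda')\,\mathcal T(\lambda')\,\mathcal D(\lambda') . \tag{2.5} $$

The Jacobi map (2.3) takes infinitesimal changes of ray directions at the observer back
to a screen at an event of $\gamma_0$ given by the value of $\lambda$. If that event is taken on some
source "plane" $z = \mathrm{const}$, $\mathcal D(\lambda)$ corresponds to the properly scaled magnification matrix
(in the terminology of SEF) of lens theory. Note that in contrast to $\mathcal S$ and $\mathcal T$, $\mathcal D$ is in
general not symmetric.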
Equation (2.4a) implies:

1) If $\mathcal T(\lambda)$ is continuously differentiable $k$ times, $\mathcal D(\lambda)$ is continuously differentiable $k + 2$
times; assuming $k$ sufficiently large (which is permissible) justifies our later use of Taylor
polynomials to study the local properties of $\mathcal D(\lambda)$ at special points.

2) $\dot{\mathcal D}^T \mathcal D - \mathcal D^T \dot{\mathcal D}$ is a first integral of (2.4a). Since it vanishes in consequence of the initial
conditions (2.4b), all solutions of (2.4) obey $\dot{\mathcal D}^T \mathcal D = \mathcal D^T \dot{\mathcal D}$, provided $\mathcal T$ is continuous
there. At discontinuities and $\delta$-type singularities of $\mathcal T$ this relation is preserved.

According to the definitions given above, $\lambda_c$ corresponds to a point $p_c$ conjugate to the
vertex (observer) if and only if $\det\mathcal D(\lambda_c) = 0$. If the rank of $\mathcal D(\lambda_c)$ is equal to zero, i.e.
if $\mathcal D(\lambda_c) = \mathcal O$, all rays arriving at $o$ have been intersecting to first order at $p_c$; if the rank
of $\mathcal D(\lambda_c)$ is equal to one, the cross section of the ray bundle has been degenerating into
an infinitesimal line segment at $p_c$. In the first case, $p_c$ is called a focus (or degenerate
conjugate point) of the caustic of $C_o^-$; in the second case, it is said to be a non-degenerate
or simple conjugate point.
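
As a numerical illustration of (2.4) and of the first integral noted in 2), the following minimal sketch integrates $\ddot{\mathcal D} = \mathcal T \mathcal D$ from the vertex with a classical fourth-order Runge-Kutta scheme. The function names, the step count and the sample tidal matrices are illustrative assumptions and are not taken from the paper; the tidal matrix is assembled as in (2.2e).

```python
import numpy as np

def optical_tidal_matrix(R, F):
    """Symmetric 2x2 tidal matrix of (2.2e) from a real R and a complex F (sample inputs)."""
    F = complex(F)
    return np.array([[R - F.real, F.imag],
                     [F.imag, R + F.real]])

def integrate_jacobi(tidal, lam_max, n_steps=2000):
    """Integrate D'' = T(lam) D with D(0) = 0, D'(0) = 1 (eq. 2.4) by RK4.

    `tidal` is a hypothetical callable lam -> 2x2 array. Returns (D, Ddot) at lam_max.
    """
    h = lam_max / n_steps
    D = np.zeros((2, 2))   # D(0) = O
    V = np.eye(2)          # Ddot(0) = I
    lam = 0.0
    for _ in range(n_steps):
        k1D, k1V = V, tidal(lam) @ D
        k2D, k2V = V + 0.5 * h * k1V, tidal(lam + 0.5 * h) @ (D + 0.5 * h * k1D)
        k3D, k3V = V + 0.5 * h * k2V, tidal(lam + 0.5 * h) @ (D + 0.5 * h * k2D)
        k4D, k4V = V + h * k3V, tidal(lam + h) @ (D + h * k3D)
        D = D + h / 6.0 * (k1D + 2 * k2D + 2 * k3D + k4D)
        V = V + h / 6.0 * (k1V + 2 * k2V + 2 * k3V + k4V)
        lam += h
    return D, V

# Sample run with a shear source whose phase varies along the ray (an arbitrary choice).
D, Ddot = integrate_jacobi(
    lambda lam: optical_tidal_matrix(-0.5, 0.3 * np.exp(1j * lam)), 2.0)
first_integral = Ddot.T @ D - D.T @ Ddot   # should vanish up to integration error
```

For $\mathcal T = -\mathcal I$ the same routine reproduces $\mathcal D(\lambda) = \sin\lambda\;\mathcal I$, which gives a simple check of the integrator.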

Comparison of (2.1a) and the derivative of (2.3) shows that

$$ \dot{\mathcal D} = \mathcal S\,\mathcal D ; \tag{2.6} $$

thus $\mathcal S$ can be obtained from $\mathcal D$ (see below, Sect. 3). With (2.6) we alternatively derive
the symmetry of the $\mathcal S$-matrix from the 'basic' differential equation (2.4a) and the vertex
initial conditions (2.4b): at an affine parameter where $\mathcal D^{-1}$ and thus $\mathcal S$ exist,
$\dot{\mathcal D}^T \mathcal D = \mathcal D^T \dot{\mathcal D}$ is equivalent to $\mathcal S = \mathcal S^T$. This also implies that at points where $\det\mathcal D = 0$, the
antisymmetric part of $\mathcal S$ is equal to zero.

² In equation (2.2g) one may write the full curvature tensor instead of $C_{\alpha\beta\gamma\delta}$; the Ricci part does
not contribute.
Consider now the determinant of the Jacobi map. From its definition in (2.3) it
follows that its absolute value is equal to the area $\delta A(\lambda)$ of the cross section of the light
beam at this affine parameter, divided by the solid angle $\delta\Omega$ which that cross section
subtends at the observer:

$$ |\det\mathcal D(\lambda)| = \frac{\delta A(\lambda)}{\delta\Omega} . \tag{2.7} $$

At a non-degenerate conjugate point the Jacobian determinant changes its sign; at a focus
its sign is conserved, as will be shown in Sect. 3. Thus, $\det\mathcal D(\lambda)$ contains information
about the area $\delta A(\lambda)$ as well as the parity, i.e., the orientation of a beam at $\lambda$ relative
to that close to the vertex. Between the vertex and conjugate points, the area $\delta A(\lambda)$ is
governed by Sachs' focusing equation:

$$ \frac{d^2}{d\lambda^2}\sqrt{\delta A(\lambda)} = \left[\mathcal R(\lambda) - |\sigma(\lambda)|^2\right]\sqrt{\delta A(\lambda)} . \tag{2.8a} $$

This ordinary differential equation has $C^2$-solutions in any $\lambda$-interval in which
$\mathcal R(\lambda) - |\sigma(\lambda)|^2$ is continuous. This is the case except if the interval contains simple conjugate
points, see Sect. 3. The initial conditions for the solution of (2.8a), which gives the area
of the beam, are $\sqrt{\delta A}(0) = 0$ and $\frac{d\sqrt{\delta A}}{d\lambda}(0) = \sqrt{\delta\Omega}$, where $\delta\Omega$ is the solid angle of the beam
at the observer. If there is an odd number of non-degenerate conjugate points between
the observer and $\lambda$, one has to take the negative root of $\delta A$, otherwise the positive one.
The driving term of the focusing equation, $\mathcal R - |\sigma|^2$, is nonpositive: the Einstein field
equation with an energy-momentum tensor of an ideal fluid yields a non-positive source
of convergence $\mathcal R$; this also holds for a cosmological constant. Hence, equation (2.8a)
describes how a light beam is focused at $\lambda$ due to the "local" curvature (Ricci focusing)
and due to its own shear rate at this affine parameter. Since this shear rate was produced
by the source of shear $\mathcal F$ at a smaller $\lambda$, this implies that both $\mathcal R$ and $\mathcal F$ yield a focusing
of the light beam. Hence, as long as one considers only the area and not the shape of a
light beam, the actions of $\mathcal R$ and $\mathcal F$ are not distinguishable. In the following we do not
consider the evolution of the area of a light beam, but that of

$$ w(\lambda) := \mathrm{SQ}[\det\mathcal D](\lambda) = \mathrm{sign}\left(\det\mathcal D(\lambda)\right)\sqrt{|\det\mathcal D(\lambda)|} ; \tag{2.9} $$

the absolute value, $|w|(\lambda) = \sqrt{|\det\mathcal D(\lambda)|} = \sqrt{\delta A(\lambda)/\delta\Omega}$, of this function describes the
angular diameter distance along the beam considered, and the sign is the parity of the
Jacobi map. From (2.7) we obtain that $w$ also fulfils the focusing equation

$$ \ddot w(\lambda) = \left[\mathcal R(\lambda) - |\sigma(\lambda)|^2\right] w(\lambda) \tag{2.8b} $$

between conjugate points; the initial conditions for $w$ are: $w(0) = 0$ and $\dot w(0) = 1$. It is
not clear a priori how to connect the solutions between conjugate points with each other,
or whether one can integrate over conjugate points at all: the matrix $\mathcal S$ of eq. (2.6) and
thus $\theta$ and $\sigma$ become singular at the vertex and at a conjugate point $\lambda_c$. We investigate the
behaviour of a light beam near the vertex and a conjugate point in the next Section and
show that the solution of (2.8) is nevertheless well defined at conjugate points between
source and observer.
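
The sign convention in (2.9) can be made concrete with a closed-form toy case. The sketch below assumes a constant, diagonal tidal matrix $\mathcal T = \mathrm{diag}(t_1, t_2)$ with $t_1, t_2 < 0$ (an illustrative choice, corresponding to $\mathcal R = -1$ and a real $\mathcal F = 0.5$ in (2.2e)), for which (2.4) gives $\mathcal D(\lambda) = \mathrm{diag}(\sin(a\lambda)/a, \sin(b\lambda)/b)$ with $a = \sqrt{-t_1}$, $b = \sqrt{-t_2}$. The first conjugate point is then non-degenerate and $w$ changes sign there.

```python
import math

def jacobi_diag(lam, t1=-1.5, t2=-0.5):
    """Diagonal entries of D(lam) for constant T = diag(t1, t2), t1, t2 < 0 (toy assumption)."""
    a, b = math.sqrt(-t1), math.sqrt(-t2)
    return math.sin(a * lam) / a, math.sin(b * lam) / b

def w(lam, t1=-1.5, t2=-0.5):
    """Signed root of det D as in (2.9): sign(det D) * sqrt(|det D|)."""
    d1, d2 = jacobi_diag(lam, t1, t2)
    det = d1 * d2
    return math.copysign(math.sqrt(abs(det)), det)

# First conjugate point of the toy model: sin(a * lam) = 0 with a = sqrt(1.5).
lam_c = math.pi / math.sqrt(1.5)
before, after = w(lam_c - 0.05), w(lam_c + 0.05)   # positive before, negative after
```

Close to `lam_c` the function behaves like $\mathrm{sign}(\epsilon)\sqrt{|\epsilon|}$, so quadrupling the distance from the conjugate point roughly doubles $|w|$.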
3 The behaviour of light beams near vertices and conjugate points



Preliminaries: Parametrization of a 2 x 2-matrix 

For our further discussion we parametrize a real 2 x 2-matrix $A$ in terms of³ 'convergence'
$\tau$, 'twist' $\omega$ and 'shear' $\Gamma_1$ and $\Gamma_2$ and write them as real and imaginary parts of complex
numbers $\mathcal A$ and $\Gamma$, respectively:

$$ \tau[A] := \tfrac12 (a_{11} + a_{22}) , \qquad \omega[A] := \tfrac12 (a_{12} - a_{21}) , \qquad \mathcal A[A] = \tau[A] + i\,\omega[A] , $$

$$ \Gamma_1[A] := \tfrac12 (a_{11} - a_{22}) , \qquad \Gamma_2[A] := \tfrac12 (a_{12} + a_{21}) , \qquad \Gamma[A] = \Gamma_1[A] + i\,\Gamma_2[A] . \tag{3.1} $$

Then, the trace of $A$ is $\mathrm{tr}\,A = 2\,\mathrm{Re}\,\mathcal A[A]$ and its determinant is $\det A = |\mathcal A[A]|^2 - |\Gamma[A]|^2$.
Note that transforming $A$ with a proper orthogonal matrix (rotation matrix)

$$ R(\vartheta) = \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix} \tag{3.2} $$

to $A' = R^{-1} A R$ leaves $\mathcal A$ invariant ($\mathcal A[A'] = \mathcal A[A]$) and transforms $\Gamma$ to
$\Gamma[A'] = \Gamma[A]\,e^{2i\vartheta}$. $\mathcal A$ and $|\Gamma|$ have an intrinsic, coordinate-independent meaning for the map
given by $A$, whereas the phase of $\Gamma$ fixes the coordinate system to which $A$ refers. We
illustrate our definitions for $\mathcal S$ and $\mathcal T$:

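$$ \mathcal A[\mathcal S](\lambda) = \theta(\lambda) \in \mathbb R , \qquad \Gamma[\mathcal S](\lambda) = -\sigma^*(\lambda) , \tag{3.3} $$

$$ \mathcal A[\mathcal T](\lambda) = \mathcal R(\lambda) \in \mathbb R , \qquad \Gamma[\mathcal T](\lambda) = -\mathcal F^*(\lambda) . \tag{3.4} $$

If the argument of $\mathcal A$ is the Jacobian matrix $\mathcal D$, we simply write $\mathcal A[\mathcal D] =: \mathcal A$, and obtain for
the derivatives with respect to $\lambda$ that $\mathcal A[\dot{\mathcal D}](\lambda) = \dot{\mathcal A}(\lambda)$ and $\mathcal A[\ddot{\mathcal D}](\lambda) = \ddot{\mathcal A}(\lambda)$; analogous
relations hold for $\Gamma$. This complex formalism is very convenient for matrix operations;
e.g., we obtain for the multiplication of real 2 x 2-matrices $A$ and $B$

$$ \mathcal A[AB] = \mathcal A[A]\,\mathcal A[B] + \Gamma^*[A]\,\Gamma[B] , $$

$$ \Gamma[AB] = \Gamma[A]\,\mathcal A[B] + \mathcal A^*[A]\,\Gamma[B] . \tag{3.5} $$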
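
The parametrization (3.1) and the product rule (3.5) are easy to check numerically. The following sketch is only an illustration: the helper names `to_complex`, `product` and `rotation` are hypothetical, and the test matrices are random.

```python
import numpy as np

def to_complex(M):
    """Return (A, Gamma) of a real 2x2 matrix M as defined in (3.1)."""
    conv = 0.5 * (M[0, 0] + M[1, 1])    # 'convergence' = Re A
    twist = 0.5 * (M[0, 1] - M[1, 0])   # 'twist' = Im A
    g1 = 0.5 * (M[0, 0] - M[1, 1])      # 'shear' component Gamma_1
    g2 = 0.5 * (M[0, 1] + M[1, 0])      # 'shear' component Gamma_2
    return complex(conv, twist), complex(g1, g2)

def product(pair_a, pair_b):
    """Product rule (3.5): (A, Gamma) of the matrix product from the factors' pairs."""
    A_a, G_a = pair_a
    A_b, G_b = pair_b
    return (A_a * A_b + G_a.conjugate() * G_b,
            G_a * A_b + A_a.conjugate() * G_b)

def rotation(angle):
    """Rotation matrix R(angle) with the sign convention of (3.2)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s], [-s, c]])

rng = np.random.default_rng(0)
M, N = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
A_mn, G_mn = to_complex(M @ N)            # computed directly from the product matrix
A_rule, G_rule = product(to_complex(M), to_complex(N))   # computed via (3.5)
```

The two results agree to rounding error, and the same helpers confirm $\det A = |\mathcal A|^2 - |\Gamma|^2$ and the transformation $\Gamma \to \Gamma e^{2i\vartheta}$ under $A' = R^{-1} A R$.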



To obtain the geometrical interpretation of $\mathcal A$ and $\Gamma$, we consider the polar decomposition
of $\mathcal D$. If $\mathcal D \neq \mathcal O$, there exist unique numbers $b_1$, $b_2$, with $0 < b_1 \geq b_2$, and unique angles
$\varphi$ and $\vartheta$, $0 \leq \varphi < \pi$, $0 \leq \vartheta < 2\pi$, such that

$$ \mathcal D = R(\vartheta)\,B(b_1, b_2, \varphi) ; $$

$R$ is the rotation matrix which was already defined, and $B$ is a symmetric matrix,

$$ B = R(-\varphi) \begin{pmatrix} b_1 & 0 \\ 0 & b_2 \end{pmatrix} R(\varphi) . $$

³ These names are chosen for convenience and are not intended to contain a geometrical meaning.
In the case of the Jacobi matrix, the geometrical interpretation of $\mathcal A$ and $\Gamma$ will be given below
eq. (3.5).

In the polar decomposition $b_1$, $b_2$ and $\vartheta$ are the coordinate-invariant numbers; $\varphi$ depends
on the chosen coordinate system. The matrix $B$ describes a rotation-free deformation,
whereas $R(\vartheta)$ rotates the plane by an angle $\vartheta$. The relation of $\{\mathcal A, \Gamma\}$ to $\{b_1, b_2, \varphi\}$ can
be derived with (3.5); we obtain:

$$ |\mathcal A| \pm |\Gamma| = b_{1,2} , \qquad \mathcal A = \tfrac12 (b_1 + b_2)\,e^{i\vartheta} , \qquad \Gamma = \tfrac12 (b_1 - b_2)\,e^{i(2\varphi - \vartheta)} . $$

Inserting the values of $b_1$ and $b_2$ yields that the 'twist' $\omega$ is related to the rotation angle
$\vartheta$ of the Jacobi map via

$$ \frac{\mathcal A}{|\mathcal A|} = e^{i\vartheta} \;\Rightarrow\; \tan\vartheta = \frac{\omega}{\tau} . \tag{3.6} $$
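
The relations between $\{\mathcal A, \Gamma\}$ and the polar decomposition can be turned into a small recipe. The sketch below is an illustration under the stated formulas only: it reads off $b_{1,2} = |\mathcal A| \pm |\Gamma|$, $\vartheta = \arg\mathcal A$ and $\varphi = (\arg\Gamma + \vartheta)/2$, and rebuilds the matrix as $R(\vartheta) R(-\varphi)\,\mathrm{diag}(b_1, b_2)\,R(\varphi)$. The function names are hypothetical, and the case $\mathcal A = 0$ (where $\vartheta$ is undefined) is rejected.

```python
import cmath
import math
import numpy as np

def rotation(angle):
    """Rotation matrix R(angle) with the sign convention of (3.2)."""
    c, s = math.cos(angle), math.sin(angle)
    return np.array([[c, s], [-s, c]])

def polar_parameters(M):
    """Return (b1, b2, theta, phi) of a real 2x2 matrix M from its (A, Gamma) pair."""
    A = complex(0.5 * (M[0, 0] + M[1, 1]), 0.5 * (M[0, 1] - M[1, 0]))
    G = complex(0.5 * (M[0, 0] - M[1, 1]), 0.5 * (M[0, 1] + M[1, 0]))
    if A == 0:
        raise ValueError("rotation angle is undefined when A vanishes")
    theta = cmath.phase(A) % (2 * math.pi)
    b1, b2 = abs(A) + abs(G), abs(A) - abs(G)
    # phi is arbitrary for a shear-free matrix; 0 is chosen here by convention.
    phi = (0.5 * (cmath.phase(G) + theta)) % math.pi if G != 0 else 0.0
    return b1, b2, theta, phi

def rebuild(b1, b2, theta, phi):
    """Reassemble R(theta) B(b1, b2, phi) with B = R(-phi) diag(b1, b2) R(phi)."""
    return rotation(theta) @ rotation(-phi) @ np.diag([b1, b2]) @ rotation(phi)

M = np.array([[1.0, 0.3], [-0.2, 0.7]])
M_back = rebuild(*polar_parameters(M))   # reproduces M up to rounding
```

With this reading $b_2$ is negative whenever $\det M < 0$, which mirrors the parity change of the Jacobi map.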

3.1 Consequences of the symmetry of S 

1) Below (2.5) we have derived that $\dot{\mathcal D}^T \mathcal D = \mathcal D^T \dot{\mathcal D}$ or, equivalently, that $\dot{\mathcal D}^T \mathcal D$ (and thus,
where $\mathcal D^{-1}$ exists, $\mathcal S$) is symmetric. Evaluating the twist part of $\dot{\mathcal D}^T \mathcal D$ yields:

$$ \mathrm{Im}\left\{\mathcal A^* \dot{\mathcal A} + \Gamma^* \dot\Gamma\right\} = 0 . \tag{3.7} $$

This constraint equation is valid at every affine parameter and in particular at the vertex
and at every conjugate point. Equation (3.7) illustrates that solving for $\ddot{\mathcal D} = \mathcal T \mathcal D$, one
has not 8 but only 7 free initial conditions. If one chooses the alternative way to solve for
the light propagation - evaluating the optical scalars and then solving $\dot{\mathcal D} = \mathcal S \mathcal D$ - then
one has a priori only 7 free initial conditions and the constraint equation (3.7) is hidden
in the nonlinear differential equations for the optical scalars.

2) Consider a light beam in an interval $\lambda \in [\lambda_n, \lambda_{n+1}]$ where $\mathcal T(\lambda) = \mathcal R(\lambda)\,\mathcal I$, i.e. where
the source of shear vanishes. Then every component of $\mathcal D$ satisfies the same differential
equation, and the general solution $\mathcal D$ is a linear combination of two linearly independent
solutions $f$ and $g$ of $\ddot x = \mathcal R x$:

$$ \mathcal D(\lambda) = f(\lambda)\,\mathcal D_n + g(\lambda)\,\mathcal D_{n+1} , \tag{3.8} $$

where $\mathcal D(\lambda_i) =: \mathcal D_i$ and $f_n = 1$, $f_{n+1} = 0$, $g_n = 0$ and $g_{n+1} = 1$.⁴ Since $g$ and $f$ are
linearly independent solutions, we also have $\dot g_n \neq 0$ and $\dot f_{n+1} \neq 0$. Inserting (3.8) into
$\dot{\mathcal D}^T \mathcal D = \mathcal D^T \dot{\mathcal D}$ and evaluating this matrix at $\lambda_n$ yields $\mathcal D_{n+1}^T \mathcal D_n = \mathcal D_n^T \mathcal D_{n+1}$. Hence we
have shown: if there is no source of shear between $\lambda_n$ and $\lambda_{n+1}$, the matrix product

$$ \mathcal D_{n+1}^T \mathcal D_n \tag{3.9a} $$

is symmetric. If one matrix (say $\mathcal D_n$) is not singular, i.e., there is no (to the vertex)
conjugate point at $\lambda_n$, then the symmetry of (3.9a) can be expressed as the statement
that the matrix

$$ \mathcal D_{n+1}\,\mathcal D_n^{-1} \tag{3.9b} $$

which carries connection vectors from $\lambda_n$ to $\lambda_{n+1}$ is symmetric. This property has been
used extensively in the proof of the magnification theorem in gravitational lens theory in
Paper I.

⁴ This is a Sturm boundary value problem: the functions $f$ and $g$ exist if and only if the solution
with $x_n = 0$ and $\dot x_n = 1$ satisfies $x_{n+1} \neq 0$. This condition is violated if and only if $\lambda_{n+1}$ and $\lambda_n$
correspond to a pair of conjugate points.

3.2 The Jacobi map near a vertex 

At the vertex, $\mathcal A = 0 = \Gamma$, $\dot\Gamma = 0 = \mathrm{Im}\,\dot{\mathcal A}$ and $\mathrm{Re}\,\dot{\mathcal A} = 1$. In this Section we do not
investigate the behaviour of the optical scalars at the vertex since it is the same as that
near a focus; this is due to the fact that locally a beam at a focus differs from that at
a vertex only by the opening angle; this angle cancels out in the optical scalars because
they are relative quantities. We investigate the Jacobi map in a Taylor expansion as a
function of $\epsilon = \lambda - \lambda_v = \lambda$; we put $\mathcal T(0) =: \mathcal T_0$ and obtain with (2.4)

$$ \mathcal D(\epsilon) = \mathcal I\,\epsilon + \frac{1}{6}\,\mathcal T_0\,\epsilon^3 + O(\epsilon^4) . \tag{3.10} $$

Eq. (3.10) implies with the symmetry of $\mathcal T_0$ that the shear of the Jacobi map is at least
of third order near the vertex, and the twist increases even slower at the vertex. In other
words, the cross section of an initially circular light beam becomes distorted to an ellipse
before it can get twisted. To compare the evolution of the shear of the Jacobian with
its twist in more detail, we claim: if the first nonvanishing contribution to $\Gamma$ is of the
order $\epsilon^n$, $n \geq 3$, at a vertex, the leading term of $\omega$ is at least of the order $\epsilon^{2n}$ (generically,
$n = 3$).

For the proof, we insert the Taylor expansions of $\Gamma_1$ and $\Gamma_2$ into the constraint equation
(3.7); this yields that the first nonvanishing contribution of this term is of the order
$2n$. Inserting the Taylor expansions of $\tau$ and $\omega$ and using that the first nonvanishing
contribution to $\tau$ is of order one we find that, in order to satisfy the constraint equation
at every order of $\epsilon$, the leading order of $\omega$ must be at least $2n$.

Therefore, the twist $\omega$ increases at the vertex very slowly compared to the shear; this
explains that "not too far" from the observer, the light beam cannot be twist-dominated,
i.e. $\omega^2 < |\Gamma|^2$ holds. This slow increase also holds for the rotation angle $\vartheta$ of the polar
decomposition of the Jacobian matrix near the vertex, since with (3.6) $\tan\vartheta = \omega/\tau$. With
$\tau(\epsilon) = \epsilon + O(\epsilon^3)$ and $\omega = a\,\epsilon^6 + O(\epsilon^7)$, the rotation angle $\vartheta = \arctan(\omega/\tau)$ becomes near the
vertex $\vartheta(\epsilon) = a\,\epsilon^5 + O(\epsilon^6)$.

3.3 The light beam near a conjugate point 

Non-degenerate conjugate points $\lambda_c$ are characterized by $0 \neq |\Gamma(\lambda_c)| = |\mathcal A(\lambda_c)|$. Since $\mathcal A$,
but not $\Gamma$, is invariant under rotation of the coordinate system, we can orient the latter
such that $\Gamma(\lambda_c) = \mathcal A(\lambda_c)$ at the conjugate point. At a focus, $\Gamma(\lambda_c) = 0 = \mathcal A(\lambda_c)$. In the
following we describe the light beam, as before near the vertex, in a Taylor expansion
around the conjugate point as a function of $\epsilon := \lambda - \lambda_c$. We first derive properties which
are common to both kinds of conjugate points; investigating the local behaviour of beams
at a conjugate point, we are only interested in solutions of (2.4a) which obey the initial
conditions (2.4b).

Theorem: At a conjugate point $\lambda_c$ an eigenvector belonging to the eigenvalue zero of $\mathcal D_c$
cannot be a zero eigenvector to $\dot{\mathcal D}_c$. In particular, this implies that at a focus the rank
of $\dot{\mathcal D}_c$ is two, and at a non-degenerate conjugate point the rank of $\dot{\mathcal D}_c$ is at least equal to
one.

Proof: Assume that there exists a conjugate point where $\mathcal D_c x = 0$ and $\dot{\mathcal D}_c x = 0$ for some
$x \neq 0$. Let $\xi(\lambda) = \mathcal D(\lambda) x$. Then this Jacobi field obeys $\xi_c = 0$, $\dot\xi_c = 0$, and hence $\xi \equiv 0$ and also
$\dot\xi(0) = 0$, which is in contradiction to $\dot\xi(0) = \dot{\mathcal D}(0)\,x = x \neq 0$. q.e.d.

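In order to derive Taylor expansions of $\det\mathcal D$, $\theta$ and $\sigma$ near conjugate points, we
consider the differential equation $\ddot{\mathcal D} = \mathcal T \mathcal D$. Using (3.5) we rewrite this linear matrix
differential equation as a system of coupled differential equations for $\mathcal A$ and $\Gamma$:

$$ \ddot{\mathcal A} - \mathcal R\,\mathcal A = -\mathcal F\,\Gamma , \qquad \ddot\Gamma - \mathcal R\,\Gamma = -\mathcal F^* \mathcal A , \tag{3.11} $$

which describe two coupled, planar oscillators with the same eigen-frequency and the
same absolute coupling strength. Taking the $n$-th derivative of (3.11), one can iteratively
calculate the Taylor-expansion coefficients of $\Gamma$ and $\mathcal A$ in the $(n + 2)$-th order as a function
of $\mathcal A_c$, $\Gamma_c$, $\dot{\mathcal A}_c$ and $\dot\Gamma_c$ (for the case of a non-degenerate conjugate point) or as a function
of $\dot{\mathcal A}_c$ and $\dot\Gamma_c$ (for the case of a focus). A conserved quantity of the differential equation
system (3.11) is

$$ \mathcal E := \mathcal A^* \dot{\mathcal A} - \mathcal A\,\dot{\mathcal A}^* + \Gamma^* \dot\Gamma - \Gamma\,\dot\Gamma^* . \tag{3.12} $$

Thus, if $\mathcal E$ vanishes at one value $\lambda$ it vanishes everywhere, for any $C^2$ solution $(\mathcal A, \Gamma)$.
($\mathcal R$ and $\mathcal F$ assumed in $C^0$.) Using that the real part of $\mathcal E$ is zero and that $\Gamma$ and $\mathcal A$ have
to fulfil the constraint equation (3.7) yields $\mathcal E = 0$ for a physical solution of (3.11). In
terms of $\Gamma$, $\mathcal A$ and their derivatives, $\sigma$ and $\theta$ can be written as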
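$$ \theta = \frac{\dot{\mathcal A}\,\mathcal A^* - \dot\Gamma^*\,\Gamma}{|\mathcal A|^2 - |\Gamma|^2} , \qquad \sigma = \frac{\dot{\mathcal A}\,\Gamma^* - \mathcal A\,\dot\Gamma^*}{|\mathcal A|^2 - |\Gamma|^2} , \tag{3.13} $$

provided the Jacobi map does not become singular; note that the reality of $\theta$ is equivalent
to $\mathcal E = 0$. Therefore, one can obtain the series expansions of $\theta$ and $\sigma$ by inserting the
expansions of $\Gamma$ and $\mathcal A$ which are derived from (3.11).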
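
As a cross-check of (3.13) against (2.1b) and (2.6), the sketch below evaluates $\theta$ and $\sigma$ from the pairs $(\mathcal A, \Gamma)$ of $\mathcal D$ and $\dot{\mathcal D}$ and compares them with the entries of a prescribed symmetric $\mathcal S$. The sample matrices and the function names are illustrative assumptions; $\dot{\mathcal D}$ is constructed as $\mathcal S \mathcal D$ so that the constraint (3.7) holds by construction.

```python
import numpy as np

def to_complex(M):
    """Return (A, Gamma) of a real 2x2 matrix M as defined in (3.1)."""
    return (complex(0.5 * (M[0, 0] + M[1, 1]), 0.5 * (M[0, 1] - M[1, 0])),
            complex(0.5 * (M[0, 0] - M[1, 1]), 0.5 * (M[0, 1] + M[1, 0])))

def optical_scalars(D, Ddot):
    """Evaluate (3.13); requires det D != 0. theta is returned as a complex number
    whose imaginary part vanishes when the constraint (3.7) is satisfied."""
    A, G = to_complex(D)
    Ad, Gd = to_complex(Ddot)
    det = abs(A) ** 2 - abs(G) ** 2
    theta = (Ad * A.conjugate() - Gd.conjugate() * G) / det
    sigma = (Ad * G.conjugate() - A * Gd.conjugate()) / det
    return theta, sigma

# Sample data: a symmetric deformation matrix S0 and an arbitrary regular Jacobi matrix D.
S0 = np.array([[0.4, 0.1], [0.1, -0.3]])
D = np.array([[1.0, 0.2], [-0.1, 0.8]])
theta, sigma = optical_scalars(D, S0 @ D)
# By (2.1b): theta = tr(S0)/2, Re(sigma) = (S0[1,1] - S0[0,0])/2, Im(sigma) = S0[0,1].
```

The comparison uses only matrix algebra, so it holds for any regular $\mathcal D$, not just for solutions of (2.4).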



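The light beam at a focus

At a focus, $\mathcal A_c = 0 = \Gamma_c$; from our theorem we know that $\det\dot{\mathcal D}_c \neq 0$ (otherwise the rank
of $\dot{\mathcal D}_c$ would be smaller than two). We obtain from (3.11)

$$ \mathcal A(\epsilon) = \epsilon\,\dot{\mathcal A}_c + \frac{\epsilon^3}{6}\left(\mathcal R_c\,\dot{\mathcal A}_c - \mathcal F_c\,\dot\Gamma_c\right) + O(\epsilon^4) , \tag{3.14a} $$

$$ \Gamma(\epsilon) = \epsilon\,\dot\Gamma_c + \frac{\epsilon^3}{6}\left(\mathcal R_c\,\dot\Gamma_c - \mathcal F_c^*\,\dot{\mathcal A}_c\right) + O(\epsilon^4) , \tag{3.14b} $$

and thus the determinant of the Jacobian becomes

$$ \det\mathcal D(\epsilon) = \epsilon^2 \left[1 + \frac{\epsilon^2}{3}\,\mathcal R_c\right] \det\dot{\mathcal D}_c + O(\epsilon^5) . \tag{3.15} $$

Since $\det\dot{\mathcal D}_c \neq 0$, the leading term of $\det\mathcal D(\epsilon)$ is of second order. The optical scalars
become near the focus

$$ \theta = \frac{1}{\epsilon}\left[1 + \frac{1}{3}\,\mathcal R_c\,\epsilon^2\right] + O(\epsilon^2) , \qquad \sigma = \frac{1}{3}\,\mathcal F_c\,\epsilon + O(\epsilon^2) . \tag{3.16a&b} $$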

The function $w$ defined in (2.9) is equal to

$$ w(\epsilon) = \mathrm{sign}\left(\det\dot{\mathcal D}_c\right) |\epsilon| \sqrt{|\det\dot{\mathcal D}_c|}\,\left(1 + O(\epsilon^2)\right) ; \tag{3.17a} $$

thus, it is continuous but not $C^1$ at the focus; $\dot w$ has a finite discontinuity. One obtains
the expansions of $\Gamma$ and $\mathcal A$ near the vertex by inserting the special values $\dot\Gamma_v = 0 = \mathrm{Im}\,\dot{\mathcal A}_v$
and $\mathrm{Re}\,\dot{\mathcal A}_v = 1$ into (3.14). As expected we obtain for $w$ at the vertex:

$$ w(\epsilon) = \epsilon\left(1 + O(\epsilon^2)\right) , \qquad \epsilon > 0 . \tag{3.17b} $$

As already claimed, the optical scalars (3.16) have the same structure at a vertex and
at a focus, since the expansions of the light beam around these points differ only by the
opening angle, $\det\dot{\mathcal D}_v = 1$ and $\det\dot{\mathcal D}_c$, which cancels in the numerator and denominator of
$\theta$ and $\sigma$. Note that in lowest order ($\epsilon^{-1}$) the behaviour of $\theta$ and $\sigma$ at the vertex (focus) is
expected: the infinitesimal neighborhood of such an event can be treated asymptotically
as the flat Minkowski spacetime. In Minkowski spacetime, however, $\theta(\lambda) = 1/\lambda$ and
$\sigma(\lambda) = 0$ holds for all $\lambda$, in particular at the vertex. The first order terms in $\theta$ and $\sigma$
demonstrate that the source of convergence $\mathcal R_0 \leq 0$ at the vertex (or focus) decreases
the divergence of a beam, and that the source of shear produces a shear rate $\sigma$:

$$ \dot\sigma(0) = \frac{1}{3}\,\mathcal F_0 . \tag{3.18} $$

This implies: $\mathcal F_0 = 0 \Leftrightarrow \dot\sigma(0) = 0$, and with (2.2d), $\mathcal F \equiv 0 \Leftrightarrow \sigma \equiv 0$. Thus, a
beam centred on $\gamma_0$ is shear free, if and only if the tangent vector of $\gamma_0$ is one of the at
most 4 principal null directions of the conformal tensor, a rare, exceptional case. Thus
generically $\sigma \neq 0$.

The fact that $\sigma = 0$ at the vertex implies that the coefficient of the rhs of the
focusing equation (2.8) is continuous at the vertex; thus its solution $w$ is well defined at
least from the observer to the first conjugate point.

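The light beam at a non-degenerate conjugate point

At a non-degenerate conjugate point, the local expansion of the beam is determined by
$\Gamma_c = \mathcal A_c \neq 0$, $\dot{\mathcal A}_c$ and $\dot\Gamma_c$. Since the constraint equation (3.7) has to be satisfied, there
are only five free initial conditions: let $a$ and $b$ be the unique complex numbers which
satisfy $\dot{\mathcal A}_c = a\,\mathcal A_c$ and $\dot\Gamma_c = b\,\mathcal A_c$; then (3.7) yields $\mathrm{Im}[a + b] = 0 \Rightarrow \mathrm{Re}[a - b] = a - b^*$.
The zero eigenvector of $\mathcal D_c$ is not a zero eigenvector of $\dot{\mathcal D}_c$ if and only if $\mathrm{Re}[a - b] \neq 0$;
therefore $\mathrm{Re}[a] \neq \mathrm{Re}[b]$. With (3.11), the expansions of $\mathcal A$ and $\Gamma$ near the conjugate
point can be written as

$$ \mathcal A(\epsilon) = \left[1 + a\,\epsilon + \frac{1}{2}\,\epsilon^2 \left(\mathcal R_c - \mathcal F_c\right)\right] \mathcal A_c + O(\epsilon^3) , \tag{3.19a} $$

$$ \Gamma(\epsilon) = \left[1 + b\,\epsilon + \frac{1}{2}\,\epsilon^2 \left(\mathcal R_c - \mathcal F_c^*\right)\right] \mathcal A_c + O(\epsilon^3) . \tag{3.19b} $$

The determinant of the Jacobian matrix is equal to

$$ \det\mathcal D(\epsilon) = 2\,\mathrm{Re}[a - b]\,|\mathcal A_c|^2\,\epsilon + \left(|a|^2 - |b|^2\right) |\mathcal A_c|^2\,\epsilon^2 + O(\epsilon^3) ; \tag{3.20} $$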



thus, the leading order of this expansion is equal to one. For the optical scalars, we
obtain,

hence, the rate of shear is real (in the chosen coordinate frame, in which $\mathcal A_c = \Gamma_c$) in
zeroth order, it becomes imaginary in first order if and only if $\mathcal F_c$ is not real. The
function $w$ is

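$$ w(\epsilon) = |\mathcal A_c|\;\mathrm{sign}\left(\mathrm{Re}[a - b]\,\epsilon\right) \sqrt{\left|2\,\mathrm{Re}[a - b]\,\epsilon\right|}\,\left(1 + O(\epsilon)\right) $$

or with the abbreviation $d_c = \frac{d}{d\lambda}\det\mathcal D\,\big|_{\lambda_c}$:

$$ w(\epsilon) = \mathrm{sign}(d_c\,\epsilon)\,\sqrt{|d_c\,\epsilon|}\,\left(1 + O(\epsilon)\right) . $$

Thus at a non-degenerate conjugate point $w$ is continuous, changes its sign, and has an
infinite first derivative.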

Now we summarize the results for the behaviour of the determinant of the Jacobi
map and the optical scalars near conjugate points:

(1) at a non-degenerate conjugate point, $\det\mathcal D \propto \epsilon$, $\theta = 1/(2\epsilon)$, $\sigma \propto 1/(2\epsilon)$ in leading order,
and

(2) at a focus, $\det\mathcal D \propto \epsilon^2$, $\theta = 1/\epsilon$, $\sigma = 0$.

With our knowledge of the behaviour of the shear rate $\sigma$ at a conjugate point, we now can
prove that the focusing equation (2.8b) is integrable over the singularity at a conjugate
point: In the worst case, the rhs of (2.8b) behaves like $|\sigma|^2 \sqrt{|\det\mathcal D|} \propto |\epsilon|^{-3/2}$; this yields
$w(\epsilon) \propto \mathrm{sign}(\epsilon)\sqrt{|\epsilon|}$. Thus the solution is well defined, even for the case where there is a
conjugate point between source and observer. The behaviour of the determinant of the
Jacobian map at the two different types of conjugate points also verifies that the sign of
$w$ from (2.8) changes only at a non-degenerate conjugate point, as was claimed in Sect. 2.
Our results also show that the points of $\gamma_0$ conjugate to the vertex form a discrete set.

4 The derivation of the gravitational lens equation from 
geometrical optics 

So far, no approximation was used. To evaluate the propagation equation (2.4) in an in- 
homogeneous universe requires several approximation assumptions. These will be stated 
in this chapter, and used to rederive the basic relations of the standard gravitational lens 
theory formalism from general relativity. 

The Friedmann universe 

If one assumes that the universe is isotropic and homogeneous, then its metric is given by
the Robertson-Walker metric. The only non-vanishing components of the metric tensor
then are $g_{tt} = c^2$, $g_{ii} = -R^2(t)\,\hat g_{ii}$, with $\hat g_{rr} = \frac{1}{1 - kr^2}$, $\hat g_{\theta\theta} = r^2$ and $\hat g_{\phi\phi} = r^2 \sin^2\theta$; the
value of $k = 0, +1, -1$ determines whether the space is flat, spherical or hyperbolic; $t$ is
the cosmic time. A fundamental observer with four-velocity $U^\alpha(\lambda)$ at an event $\lambda$ on the
central ray of a beam measures the frequency
$\omega(\lambda) := c\,k^\alpha(\lambda)\,U_\alpha(\lambda) = -\omega_o\,\tilde k^\alpha(\lambda)\,U_\alpha(\lambda) =: \omega_o \left(1 + z(\lambda)\right)$;
$k^\alpha$ is the wavevector of the central ray, $\omega_o$ is the frequency at the vertex
of the beam and $z(\lambda)$ is (by definition) the (red)shift. In a Robertson-Walker metric,
the redshift is isotropic and is related to the scale factor of the metric by $R(z) = R_o/(1 + z)$,
where Ro is the scale factor at the vertex of the beam (z = 0, t = to). The affine 
parameter-redshift differential equation is

$$\frac{dz}{d\lambda} = \frac{(1+z)^2}{c}\,\frac{\dot R}{R} \ . \qquad (4.1)$$

Note that this yields a proper distance-affine parameter relation at redshift $z$ of

$$dD_{\rm proper} = (1+z)\,d\lambda \ , \qquad (4.2)$$

which is consistent with our convention that the affine parameter equals the proper length
at the vertex at $\lambda = 0 = z$. For a Friedmann universe with zero cosmological constant
and an energy momentum tensor of a matter-dominated ideal fluid, $p \ll \rho c^2$, equation
(4.1) can be solved by inserting the Friedmann equation for $\dot R/R$:

$$\lambda(z) = \frac{c}{H_0}\int_0^z \frac{dz'}{(1+z')^3\sqrt{\Omega z'+1}} \ ; \qquad (4.3)$$

$H_0$ is the Hubble parameter $d(\ln R)/dt$ at the observation event $t_0$.
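
As a numerical illustration of the affine parameter-redshift relation (4.3), the integral can be evaluated by quadrature. The following is a minimal sketch and not part of the derivation: it works in units of $c/H_0$, uses a plain composite Simpson rule, and the function names `affine_parameter` and `affine_parameter_empty` are illustrative choices. For $\Omega = 0$ the integrand reduces to $(1+z')^{-3}$, which gives a closed form to compare against.

```python
import math


def affine_parameter(z, omega, n_steps=2000):
    """Affine parameter lambda(z) of eq. (4.3) in units of c/H0 (Simpson's rule)."""
    if z == 0.0:
        return 0.0
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of intervals
    h = z / n_steps

    def integrand(zp):
        return 1.0 / ((1.0 + zp) ** 3 * math.sqrt(omega * zp + 1.0))

    total = integrand(0.0) + integrand(z)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def affine_parameter_empty(z):
    """Closed form of (4.3) for Omega = 0, where the integrand is (1+z')**-3."""
    return 0.5 * (1.0 - (1.0 + z) ** -2)


if __name__ == "__main__":
    for z in (0.5, 1.0, 3.0):
        print(z, affine_parameter(z, omega=1.0), affine_parameter(z, omega=0.0))
```

Multiplying the returned value by $c/H_0$ restores physical units; the affine parameter stays finite as $z \to \infty$ because the integrand falls off at least like $(1+z')^{-3}$.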

Parallel transport in a Robertson-Walker spacetime

To calculate the source of shear defined in (2.2g), we need the screen vectors $E_i^\alpha$, $i = 1,2$,
and $\tilde k^\alpha$ along the central ray. We choose the center of the spatial coordinate system
$(r, \theta, \phi)$ at the observer, and the central ray $\gamma_0$ connecting source and observer in the
direction of $\theta = \pi/2$. Consider the dimensionless function

$$T(t,r) := \int^{t}\frac{c\,dt'}{R(t')} + \int_0^r \frac{dx}{\sqrt{1-kx^2}} \ .$$

It solves the eikonal equation; the hypersurface $T(t,r) = T(t',0)$ defines the past null
cone of $(t',0)$. Therefore, the phase functions converging on the world line $r = 0$ are
all given by $S(t,r) = f(T(t,r))$, where $f$ depends on the phase $S(t,0)$. The vector $\tilde k_\alpha$
(which is on $C_0$) has to be a constant multiple of $T_{,\alpha} = (R^{-1}(t), 1/\sqrt{1-kr^2}, 0, 0)$;$^5$ since
$\tilde k_0 = -1$ at the vertex, we obtain

$$\tilde k_\alpha(z) = -(1+z)\left(1, \frac{R(z)}{\sqrt{1-kr^2}}, 0, 0\right) \ ,$$

and thus

$$\tilde k^\alpha(z) = -(1+z)\left(1, -\frac{1}{\sqrt{-g_{rr}(z)}}, 0, 0\right) \ . \qquad (4.4)$$



The spacelike screen vectors $E_1^\alpha$ and $E_2^\alpha$ adapted to $\tilde k^\alpha$ can be chosen at the observer
proportional to $[0, 0, 1, 0]$ and $[0, 0, 0, 1]$. For general $z$ we then obtain

$$E_1^\alpha(z) = \frac{1}{\sqrt{-g_{\theta\theta}(z)}}\,[0,0,1,0] \ , \qquad E_2^\alpha(z) = \frac{1}{\sqrt{-g_{\phi\phi}(z)}}\,[0,0,0,1] \ . \qquad (4.5)$$

$^5$ The components of a four vector $x^\alpha$ are $x^0 = ct$, $r$, $\theta$, $\phi$.



The components of the vectors $E_i^\alpha(0)$ become singular at the observer at $z = 0$. This
is due to the choice of the coordinate system; the vectors themselves and their inner 
products are regular. 

The on-average Friedmann universe

Of course, a homogeneous universe is not realistic. A better model must take into account
that only a fraction $0 < \tilde\alpha < 1$ of the matter is distributed homogeneously, whereas the
rest is concentrated in clumps. Imagine a model universe that is inhomogeneous on
small scales and homogeneous on large scales (some 100 Mpc) such that this clumping
of matter does not affect "global" (or large scale) functions like $R(t)$, $R(z)$, $\lambda(z)$ and the
parallelly transported fields $\tilde k^\alpha(z)$, $E_i^\alpha(z)$. This means that, on average, this universe
behaves like a Friedmann universe with density $\rho_F$ which has the same total matter
content as the actual universe. Thus, such a model is called an on-average Friedmann
universe (see, e.g. Zeldovich 1964, Dyer & Roeder 1973).

This picture of the matter distribution in our universe is a realistic one if one is 
interested in the light deflection caused by 'strong', isolated matter inhomogeneities, such 
as galaxies and clusters of galaxies, the deflectors which produce multiply-imaged QSOs, 
radio rings, and luminous arcs. In these situations, it seems to be a fair approximation 
to consider the light beams between us and the deflector, and between the deflector and 
the source to be nearly unperturbed by matter inhomogeneities; if there is more than 
one deflector along the line-of-sight, this can be accounted for in the present prescription. 
An alternative view of the matter distribution in the universe is provided by considering 
larger scales, on which the density inhomogeneities are linear or quasi-linear. Then it 
is more realistic to model the matter distribution as a field $\delta\rho$ which is superposed on
the Friedmann density $\rho_F$, such that $\langle\delta\rho\rangle = 0$, and the average is taken on spatial scales
which are small compared to the Hubble length, but larger than the largest scale on
which the density fluctuations $\delta\rho$ still have appreciable power (see, e.g., Gunn 1967,
Blandford et al. 1991, Kaiser 1992 for studies of light propagation in such a weakly 
inhomogeneous universe). In the following we adopt the first view, that of a clumpy 
universe; we note, however, that most of our results derived below also apply for the 
weakly inhomogeneous universe. In particular, the (multiple deflection) gravitational 
lens equation can also be used in the latter case, if the universe is 'sliced' into redshift 
bins and the matter inhomogeneities are projected onto 'lens planes' in the bins, since the 
multiple deflection gravitational lens equation can be considered just as a discretization 
of the exact propagation equation (2.4). The only modification that has to be applied 
in the case of a weakly inhomogeneous universe is that $\mathcal{R}_{cl}$ no longer is nonpositive, and
the projected surface mass density $\Sigma$ in each lens plane can attain positive and negative
values. Furthermore, since the magnification, defined in Sect. 5 below, is defined relative
to the Friedmann-Lemaitre universe, the mean magnification relative to that must be
unity (see the discussion in Sect. 4.5.1 of SEF), and the focusing theorem of Sect. (5.12)
no longer holds, since $\mathcal{R}_{cl}$ can have either sign.

4.1 The sources of shear and convergence for weak, isolated 
inhomogeneities 

Weak, isolated inhomogeneities

We assume that inhomogeneities like galaxies or clusters of galaxies are isolated from
each other such that in each domain containing an inhomogeneity, small compared to
the Hubble distance, the metric can be approximated by a post-Minkowskian line element

$$ds^2 = \left(1 + \frac{2\Phi}{c^2}\right)c^2dt^2 - \left(1 - \frac{2\Phi}{c^2}\right)d\mathbf{x}^2 \ .$$

The relative velocities of its mass distribution are small, $v \ll c$, and its Newtonian
gravitational potential $\Phi$ is weak, $|\Phi| \ll c^2$. If the density outside such regions is $\tilde\alpha\rho_F$ and
we write for the density inside a clump $\tilde\alpha\rho_F + \rho_{cl}$, such that $\rho_{cl}$ is localized in the region,
Poisson's equation $\Delta_3\Phi = 4\pi G\rho_{cl}$ holds within the region.$^6$ The metric does not change
appreciably on the time scale light needs to propagate through the inhomogeneity. We
therefore call such inhomogeneities quasistatic, weak inhomogeneities.

The source of convergence 

First we consider the source of convergence $\mathcal{R}$, defined in (2.2f). Inserting the field
equations with an energy-momentum tensor of an ideal fluid yields:

$$\mathcal{R} = -\frac{4\pi G}{c^2}\,\rho\,\hat U_\alpha \hat U_\beta\,\hat k^\alpha \hat k^\beta \ . \qquad (4.6)$$

In this equation, $\hat U^\alpha$ is the four velocity of the ideal fluid, which deviates from the velocity
in a pure Friedmann universe $U^\alpha$ by the peculiar velocity $U^\alpha_{\rm pec}$, $\hat U^\alpha = U^\alpha + U^\alpha_{\rm pec}$, and
$\hat k^\alpha$ is the wave vector of the central ray of the beam considered, which deviates from
the wavevector $\tilde k^\alpha$ in a Friedmann universe due to deflection in the inhomogeneity by
a vector $\delta k^\alpha$, $\hat k^\alpha = \tilde k^\alpha + \delta k^\alpha$. The matter density $\rho = \rho_{bg} + \rho_{cl}$ is given as a sum of
the reduced background density in the on-average Friedmann universe, $\rho_{bg} = \tilde\alpha\rho_F$, and
the matter density of the clump $\rho_{cl}$. If we use that peculiar velocities of inhomogeneities
(e.g., galaxies) are small, $v_{\rm pec} \lesssim 10^{-3}c$, and that their gravitational fields are also small,
$|\Phi|/c^2 \ll 1$, we can neglect the contributions from $U^\alpha_{\rm pec}$ and $\delta k^\alpha$ and obtain from (4.6)
that in lowest order, with $\mathcal{R} = \mathcal{R}_{bg} + \mathcal{R}_{cl}$, the contribution of the clumps is given by

$$\mathcal{R}_{cl} \approx -\frac{4\pi G}{c^2}\,\rho_{cl}\,U_\alpha U_\beta\,\tilde k^\alpha \tilde k^\beta \ . \qquad (4.7)$$

$^6$ Concerning the difficult problem of constructing approximate solutions to Einstein's equations
containing quasi-static, weak inhomogeneities separated by 'empty regions' and being Friedmannian
on a large scale, see Futamase & Sasaki 1989, Jacobs et al. 1993; see also Kasai 1993.

Consider an inhomogeneity along a ray $\gamma_0$ localized in the affine parameter interval
$[\lambda_{\rm min}, \lambda_{\rm min} + \Delta\lambda]$ which is small compared to its distance to us: $\Delta\lambda \ll \lambda_{\rm min}$; let $z_d$ be
an element of the corresponding redshift interval $[z_{\rm min}, z_{\rm min} + \Delta z]$. Since the inhomogeneity
must not change significantly during the time the light beam traverses it, we
can calculate (4.7) for one instant of time, $t(z_d)$. The line element in the asymptotically
flat neighborhood $\mathcal{U}$ of $\lambda(z_d)$ is $ds^2 = (1 + 2\Phi/c^2)(c\,dt)^2 - (1 - 2\Phi/c^2)(d\zeta)^2$, with $t = R_d\,\eta$,
$R_d^2\,d\sigma_k^2 \approx (d\zeta)^2$ and $R(z_d) = R_d$; $(t,\zeta)$ denote post-Minkowski coordinates centered
on $\lambda(z_d)$ and oriented such that $\zeta_3$ is parallel to the spatial direction of $\gamma_0$ there. We
calculate $\mathcal{R}$ and $\mathcal{F}$ not only on the central ray $\gamma_0$ of the beam considered, but for all
spatial positions $\zeta$ in $\mathcal{U}$. This yields $\mathcal{R}$ and $\mathcal{F}$ for all rays traversing $\mathcal{U}$, where the spatial
paths of the rays are parametrized by $\zeta(\lambda)$; note that the rays in $\mathcal{U}$ do not have to be
infinitesimally near to $\gamma_0$ in the sense of (2.3). The source of convergence on a ray in $\mathcal{U}$
is the sum of

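$$\mathcal{R}_{bg}(\lambda) = -\frac{4\pi G}{c^2}\,\rho_{bg}(z)\,[1+z]^2 \ , \qquad \mathcal{R}_{cl}(\lambda) = -\frac{4\pi G}{c^2}\,\rho_{cl}(\zeta(\lambda))\,[1+z]^2 \ , \qquad (4.8a)$$

where we have written $z$ instead of $z(\lambda)$. If one uses the Poisson equation $\Delta_3\Phi =
4\pi G\rho_{cl}$ for the quasistationary Newtonian gravitational potential $\Phi(t_d, \zeta) \approx \Phi(\zeta)$ of the
inhomogeneity, this yields for $\mathcal{R}_{cl}$:

$$\mathcal{R}_{cl}(\lambda) = -\frac{(1+z)^2}{c^2}\,\Delta_3\Phi(\zeta(\lambda)) \ . \qquad (4.8b)$$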

Up to now we have considered weak inhomogeneities which are small in size compared
to their distance to us. Now we will restrict ourselves to those which are sufficiently
thin, such that one can replace the wavevector and the vectors $E_i^\alpha$ in (2.2g) by (4.4) and
(4.5) evaluated at the redshift of the clump. (That is, for the calculation of the source
term for the evolution of the light beam, one can neglect the deflection relative to the
unperturbed light beam.) Thus we approximate

$$\zeta(\lambda) \approx (\zeta_1(\lambda_d), \zeta_2(\lambda_d), \zeta_3(\lambda)) \qquad (4.9)$$

for rays which are roughly parallel to $\gamma_0$ at $\lambda(z_d)$; the deviation of rays from the parallel
direction must be small, as well as the typical deflection angle caused by an inhomogeneity,
otherwise the approximation (4.9) would break down.$^7$ With our choice of the
coordinate system, $\zeta_1$ and $\zeta_2$ are orthogonal coordinates on the screen defined in Sect. 2.
Therefore we write $(\zeta_1(\lambda_d), \zeta_2(\lambda_d)) = \xi$ in the following; $\xi$ is a parameter to label rays.
Equations (4.8) hold for an infinitesimal beam with central ray $\gamma_0$ ($\xi = 0$), and for any
other ray which is in $\mathcal{U}$ and roughly parallel to $\gamma_0$ at $\lambda_d$.

The approximation (4.9) is equivalent to one on which gravitational lens theory is
based: there, the source term for the light bending along the deflected light ray is
approximated locally by that evaluated along the path of the unperturbed ray.

The source of shear 

Outside the matter inhomogeneities, where $\rho_{bg} = \tilde\alpha\rho_F$, we neglect the source of shear
due to clumps; i.e., we neglect the long-range gravitational action of the weak
inhomogeneities, and put $\mathcal{F} = 0$. At the inhomogeneity we evaluate (2.2g) in post-Minkowskian
coordinates, hence we have to transform the coordinates from $(x^0, r, \theta, \phi)$
to $(x^0, \zeta_1, \zeta_2, \zeta_3)$. Note that we have chosen the $\zeta_3$-direction of the new coordinate system
parallel to the spatial direction of the central ray. Since the normalization of all vectors
stays invariant under the transformation of the coordinate system and since the norm in
the local Minkowski system is built with $\eta = {\rm diag}(1,-1,-1,-1)$, we have to replace the
metric tensor $g$ by $\eta$ in (4.4) and (4.5) and we obtain

$$\tilde k^\alpha(z_d) = -(1+z_d)\,[1,0,0,-1] \ , \quad E_1^\alpha(z_d) = [0,1,0,0] \ , \quad E_2^\alpha(z_d) = [0,0,1,0] \ . \qquad (4.10)$$

$^7$ In astrophysically relevant situations, the beams under consideration have an opening angle of
$\sim 1$ arcminute $\sim 3\times10^{-4}$ for galaxy clusters, and of $\sim 10$ arcseconds $\sim 5\times10^{-5}$ for lensing by
individual galaxies; the corresponding typical deflection angles are of the same order or smaller.

The Riemann tensor in the post-Minkowskian approximation in $ct$ and $\zeta$ coordinates is
equal to

$$R_{\alpha\beta\gamma\delta} = \frac{1}{c^2}\left\{\delta_{\alpha\gamma}\Phi_{,\beta\delta} - \delta_{\beta\gamma}\Phi_{,\alpha\delta} - \delta_{\alpha\delta}\Phi_{,\beta\gamma} + \delta_{\beta\delta}\Phi_{,\alpha\gamma}\right\} \ . \qquad (4.11)$$

Thus, (4.10) and (4.11) yield that there are only contributions to the source of shear
in (2.2g) if $\alpha, \gamma \in \{1,2\}$ and $\beta, \delta \in \{0,3\}$; hence the summation contains only 16
nonvanishing contributions. Using the quasistationarity of the metric, $|\Phi_{,0}| \ll |\Phi_{,i}|$ yields
that in lowest order of $\Phi$, only the following eight components of the curvature tensor
contribute to (2.2g):

$$R_{1010} = \frac{1}{c^2}\Phi_{,11} \ , \quad R_{1020} = R_{2010} = \frac{1}{c^2}\Phi_{,12} \ , \quad R_{1313} = \frac{1}{c^2}\left(\Phi_{,33} + \Phi_{,11}\right) \ , \qquad (4.12a)$$

$$R_{2020} = \frac{1}{c^2}\Phi_{,22} \ , \quad R_{1323} = R_{2313} = \frac{1}{c^2}\Phi_{,12} \ , \quad R_{2323} = \frac{1}{c^2}\left(\Phi_{,33} + \Phi_{,22}\right) \ . \qquad (4.12b)$$

Inserting $\Phi_{,12} = \Phi_{,21}$, (4.12) and (4.10) in (2.2g) and using (4.9) yields

$$\mathcal{F}_{cl}(\xi;\lambda) = \frac{1}{c^2}(1+z)^2\left\{\Phi_{,11} - \Phi_{,22} - 2i\,\Phi_{,21}\right\}(\xi;\zeta_3(\lambda)) \ . \qquad (4.13)$$

Therefore we obtain with (4.13), (4.8a) and (4.8b) that the optical tidal matrix along a
family of rays traversing an asymptotically flat neighborhood of an event $\lambda_d$ localized in
a weak geometrically-thin clump in an on-average Friedmann universe, such that their
spatial directions are roughly parallel to the $\zeta_3$-direction at $\lambda_d$, is $\mathcal{T}(\xi;\lambda) = \mathcal{T}_{bg}(z) + \mathcal{T}_{cl}(\lambda)$
with $\mathcal{T}_{bg}(z) = \mathcal{R}_{bg}(z)\,\mathcal{I}$ and

$$(\mathcal{T}_{cl})_{ik}(\xi;\lambda) = -\frac{(1+z)^2}{c^2}\left[2\,\Phi_{,ik} + \delta_{ik}\,\Phi_{,33}\right](\xi;\zeta_3(\lambda)) \ , \qquad i,k \in \{1,2\} \ . \qquad (4.14)$$

Thus, the optical tidal matrix is simply related to the ordinary tidal matrix, i.e., the
matrix of the second derivatives of the Newtonian potential. In these equations, $z = z(\lambda)$,
and $\xi$ is the screen position of the ray considered at $\lambda_d$ relative to one chosen ray $\gamma_0$ of
the family; $\zeta_3$ is the direction in the post-Minkowski coordinate system parallel to the
rays at $\lambda_d$, hence with (4.2)

$$d\zeta_3 = (1+z)\,d\lambda \ . \qquad (4.15)$$

If one evaluates the mapping of an infinitesimally thin beam (i.e., one needs the value of
(4.14) on one ray $\gamma_0$ only), one puts $\xi = 0$ in (4.14).

4.2 The thin lens approximation 

One of the simplifying assumptions underlying lens theory is that the inhomogeneities
are geometrically thin. Thus one approximates the inhomogeneities by two-dimensional
surface mass densities $\Sigma$. Let one of the distributions be situated on the 'plane' $\zeta_3 = 0$,

$$\Sigma(\xi)\,\delta(\zeta_3) \approx \rho_{cl}(\xi,\zeta_3) \ , \qquad (4.16)$$

where

$$\Sigma(\xi) = \int_{-\infty}^{+\infty} d\zeta_3\,\rho_{cl}(\xi,\zeta_3) \ . \qquad (4.17)$$

The Newtonian potential of this distribution is

$$\Phi(\xi,\zeta_3) = -G\int_{\mathbb{R}^2}\frac{\Sigma(\xi')\,d^2\xi'}{\sqrt{(\xi-\xi')^2+\zeta_3^2}} \ . \qquad (4.18)$$

The derivatives $\Phi_{,ik}$, $\Phi_{,33}$ which occur in the tidal matrix (4.14) decrease like the inverse
third power of the distance from the plane of the mass distribution. It is, therefore,
reasonable to approximate the optical tidal matrix for a clump of matter as a
delta-distributional source term in $\lambda$:

$$\mathcal{T}_{cl}(\xi;\lambda) \approx \delta(\lambda-\lambda_d)\int_{-\infty}^{+\infty}\mathcal{T}_{cl}(\xi;\zeta_3(\lambda'))\,d\lambda' \ . \qquad (4.19)$$



The deflection potential $\Psi$

The deflection potential of an inhomogeneity is defined as usual by

$$\Psi(\xi) = \frac{4G}{c^2}\int_{\mathbb{R}^2} d^2\xi'\,\Sigma(\xi')\,\ln\left(\frac{|\xi-\xi'|}{D_d}\right) \qquad (4.20)$$

(see SEF, Sects. 4.3 & 5.1). In the deflection potential, the denominator in the argument
of the logarithm is an arbitrary length, to make this argument dimensionless; we have
chosen it equal to the so-called empty cone angular diameter distance $D_d := D(z_d)$ from
the observer to the redshift $z_d$. Under a change of this length scale, the value of (4.20)
changes only by an unimportant additive constant. It is straightforward to see that $\Psi$
and $\Sigma$ are related to each other by the Poisson equation for the surface mass density

$$\Delta_2\Psi = \frac{8\pi G}{c^2}\,\Sigma \ , \qquad (4.21)$$

where $\Delta_2$ is the two-dimensional Laplace operator.

We now show that the approximate tidal matrix of eq. (4.19) can be expressed in
terms of the second derivatives of the deflection potential rather than in terms of the
$\Phi$-derivatives. In fact, using eqs. (4.18), (4.20) and (4.15) one verifies by a straightforward
calculation that, for $i, k \in \{1,2\}$,

$$\int_{-\infty}^{+\infty} d\zeta_3\,\Phi_{,ik}(\xi,\zeta_3) = \frac{c^2}{2}\,\Psi_{,ik}(\xi) \ , \qquad (4.22)$$

$$\int_{-\infty}^{+\infty} d\zeta_3\,\Phi_{,33}(\xi,\zeta_3) = 0 \ . \qquad (4.23)$$

Therefore, eq. (4.19) leads to

$$\mathcal{T}_{cl}(\xi;\lambda) = -(1+z_d)\,\delta(\lambda-\lambda_d)\begin{pmatrix}\Psi_{,11}(\xi) & \Psi_{,12}(\xi)\\ \Psi_{,21}(\xi) & \Psi_{,22}(\xi)\end{pmatrix} = -(1+z_d)\,\delta(\lambda-\lambda_d)\,\mathbf{U}(\xi) \ . \qquad (4.24)$$

In the last step, we have defined the deflection matrix $\mathbf{U}(\xi)$ as the Hesse matrix of the
deflection potential, $\mathbf{U}(\xi) =: \mathbf{U}[\Psi](\xi)$.
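
The line-of-sight integrals (4.22) and (4.23) can be checked numerically for the simplest case, a point mass. The sketch below is an illustration only: it assumes $\Phi = -GM/r$ with $GM = 1$, for which the deflection potential is $\Psi = (4GM/c^2)\ln(|\xi|/D_d)$, and it maps the infinite $\zeta_3$ range onto a finite interval with the substitution $\zeta_3 = |\xi|\tan t$. The function names are hypothetical.

```python
import math


def line_of_sight_integrals(xi1, xi2, n_steps=4000):
    """Integrate Phi_{,11}, Phi_{,12}, Phi_{,33} of Phi = -GM/r over zeta_3 (GM = 1).

    Uses zeta_3 = b*tan(t), t in (-pi/2, pi/2), and the midpoint rule in t.
    """
    b = math.hypot(xi1, xi2)
    h = math.pi / n_steps
    i11 = i12 = i33 = 0.0
    for j in range(n_steps):
        t = -0.5 * math.pi + (j + 0.5) * h
        zeta = b * math.tan(t)
        r = math.hypot(b, zeta)
        weight = b / math.cos(t) ** 2 * h  # d(zeta_3) = b sec^2(t) dt
        # Second derivatives of -1/r: delta_ij / r^3 - 3 x_i x_j / r^5
        i11 += (1.0 / r ** 3 - 3.0 * xi1 * xi1 / r ** 5) * weight
        i12 += (-3.0 * xi1 * xi2 / r ** 5) * weight
        i33 += (1.0 / r ** 3 - 3.0 * zeta * zeta / r ** 5) * weight
    return i11, i12, i33


def half_c2_psi_hessian(xi1, xi2):
    """(c^2/2) Psi_{,ik} for Psi = (4GM/c^2) ln(|xi|/D_d) with GM = 1."""
    b2 = xi1 * xi1 + xi2 * xi2
    p11 = 2.0 * (1.0 / b2 - 2.0 * xi1 * xi1 / b2 ** 2)
    p12 = 2.0 * (-2.0 * xi1 * xi2 / b2 ** 2)
    return p11, p12


if __name__ == "__main__":
    print(line_of_sight_integrals(0.7, -0.4))
    print(half_c2_psi_hessian(0.7, -0.4))
```

The first two numbers of the first line should agree with the second line, as (4.22) states, and the third should be close to zero, as (4.23) states. The choice of $D_d$ does not enter, since it only shifts $\Psi$ by a constant.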

We can generalize the result (4.24) to the case of several inhomogeneities, i.e., for the
following case, which also is the "standard situation" in gravitational lens theory: given
an observer at redshift zero in an on-average Friedmann universe, a source at redshift
$z_s =: z_{N+1}$ and an arbitrary number $N$ of geometrically-thin, weak inhomogeneities
between source and observer, situated at $\lambda_1, \ldots, \lambda_N$ with corresponding redshifts of $z_1, \ldots, z_N$.
Then, if we indicate the two-dimensional screen positions of a ray (relative to one ray
$\gamma_0$ of the family) in the inhomogeneities with $\xi_i$ and the deflection matrices at those
positions as $\mathbf{U}_i(\xi_i)$, the optical tidal matrix is equal to

$$\mathcal{T}(\xi_1, \ldots, \xi_N; \lambda) = \mathcal{R}_{bg}(\lambda)\,\mathcal{I} - \sum_{i=1}^{N}(1+z_i)\,\mathbf{U}_i(\xi_i)\,\delta(\lambda-\lambda_i) \ ; \qquad (4.25a)$$

the different rays considered must be roughly parallel to each other before the first
inhomogeneity; then the same holds at every following inhomogeneity provided the deflection
angles are small. Again, considering only one infinitesimal beam with central ray $\gamma_0$, one
has to consider

$$\mathcal{T}(\lambda) = \mathcal{R}_{bg}(\lambda)\,\mathcal{I} - \sum_{i=1}^{N}(1+z_i)\,\mathbf{U}_i(0)\,\delta(\lambda-\lambda_i) \ . \qquad (4.25b)$$

4.3 The recurrence relation for the mapping of the light beam 

The equations (4.25) result from well-defined assumptions and approximations. Hence we
can solve the differential equation (2.4a) with (4.25) as source term. We again consider
not only a single beam, but a family of beams with (nearly) parallel central rays, and
label a beam by the screen position $\xi_n$ of its central ray relative to one reference ray $\gamma_0$.
Defining $\dot{\mathcal{D}}_n^+(\xi_n) := \lim_{\lambda\searrow\lambda_n}\dot{\mathcal{D}}(\xi_n;\lambda)$ and $\dot{\mathcal{D}}_n^-(\xi_n) := \lim_{\lambda\nearrow\lambda_n}\dot{\mathcal{D}}(\xi_n;\lambda)$, this yields:

$$\dot{\mathcal{D}}_n^+(\xi_n) - \dot{\mathcal{D}}_n^-(\xi_n) = -(1+z_n)\,\mathbf{U}_n(\xi_n)\,\mathcal{D}_n(\xi_n) \ ; \qquad (4.26)$$

thus the Jacobi matrix, but not its derivative, is continuous at an inhomogeneity in lens
approximation. On the lhs of (4.26) we want to express the derivatives of the Jacobi
matrices as functions of the values of the Jacobi matrices at redshifts $z_{n-1}$, $z_n$ and $z_{n+1}$.
In order to do this, we first have to determine the evolution of an infinitesimal light beam
outside clumps.

The evolution of a beam outside clumps, Dyer-Roeder differential equation

We now investigate the evolution of a beam outside clumps, which we call empty beam
or empty cone in the following. Since outside of clumps the source of shear vanishes, the
differential equation (2.4a) simplifies with the first of (4.8a) to

$$\ddot{\mathcal{D}}(\lambda) = \mathcal{R}_{bg}(\lambda)\,\mathcal{D}(\lambda) = -\frac{4\pi G}{c^2}\,\rho_{bg}(z)\,[1+z]^2\,\mathcal{D}(\lambda) \ .$$

If we insert the evolution of the density with redshift, $\rho_{bg}(z) = \tilde\alpha\rho_0(1+z)^3$, the definition
of the density parameter $\Omega = \rho_0/\rho_{\rm crit}$ with $\rho_{\rm crit} = \frac{3H_0^2}{8\pi G}$, we find that each component $B$ of $\mathcal{D}$
fulfills the differential equation

$$\frac{d^2B}{d\lambda^2} = -\frac{3}{2}\,\tilde\alpha\,\Omega\left(\frac{H_0}{c}\right)^2(1+z)^5\,B \ .$$

Using the affine parameter-redshift relation (4.3), this finally transforms to the
Dyer-Roeder differential equation (Dyer & Roeder 1973)

$$(z+1)(\Omega z+1)\,\frac{d^2B(z)}{dz^2} + \left(\frac{7}{2}\Omega z + \frac{\Omega}{2} + 3\right)\frac{dB(z)}{dz} + \frac{3}{2}\,\tilde\alpha\,\Omega\,B(z) = 0 \ . \qquad (4.27)$$

This second order differential equation has two linearly independent solutions; two
solutions $B_1$ and $B_2$ are independent if and only if the Wronskian $W(z) := (B_1B_2' - B_2B_1')(z)$
is different from zero at one value of $z$ (and thus for every $z$). The first and second
terms of equation (4.27) describe the evolution of a light beam due to the expansion of
the universe, therefore $\Omega$ appears; the third term describes the convergence of a light
beam due to the local homogeneous matter density $\tilde\alpha\rho_F$ in the empty cone (no clumps);
for this reason, a term $\tilde\alpha\Omega$ occurs. Consider a solution $D(z_i,z)$ of (4.27) which is zero
at redshift $z_i$ and whose derivative with respect to redshift obeys the local Hubble law,
or equivalently, the infinitesimal quantity $\frac{dD}{dz}dz$ equals the infinitesimal proper length
$dD_{\rm proper}(z_i)$ at redshift $z_i$. Then $D(z_1,z_2)$ is the empty cone angular diameter distance
from redshift $z_1$ to $z_2$; it can be described by a function $r(z_1,z)$, solving (4.27) with
boundary conditions

$$\frac{d}{dz}r(z_i,z)\Big|_{z=z_i} = \frac{1}{(1+z_i)^2\sqrt{\Omega z_i+1}} \ , \qquad r(z_i,z)\big|_{z=z_i} = 0 \ , \qquad (4.28)$$

in the following form:

$$D(z_1,z_2) = \frac{c}{H_0}\,|r(z_1,z_2)| \ . \qquad (4.29)$$
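
The initial value problem (4.27), (4.28) is straightforward to integrate numerically. The sketch below is an illustration under stated assumptions, not the closed-form solution referred to in the text: it uses a classical fourth-order Runge-Kutta step, returns $r(z_i,z)$ so that distances come out in units of $c/H_0$, and the names `dyer_roeder_r` and `alpha_tilde` (the fraction of matter distributed homogeneously) are hypothetical. For $\Omega = 1$ the equation has elementary solutions that can be verified by substitution, $r(0,z) = \frac{2}{5}\left[1-(1+z)^{-5/2}\right]$ for an empty beam and $r(0,z) = 2\left[(1+z)^{-1}-(1+z)^{-3/2}\right]$ for a filled beam, and these serve as checks.

```python
import math


def dyer_roeder_r(z_i, z, omega, alpha_tilde, n_steps=2000):
    """Integrate eq. (4.27) from z_i to z with initial data (4.28); returns r(z_i, z)."""

    def rhs(zz, r, dr):
        # Eq. (4.27) solved for the second derivative with respect to z.
        d2r = -((3.5 * omega * zz + 0.5 * omega + 3.0) * dr
                + 1.5 * alpha_tilde * omega * r) / ((zz + 1.0) * (omega * zz + 1.0))
        return dr, d2r

    h = (z - z_i) / n_steps  # negative when integrating towards lower redshift
    zz, r = z_i, 0.0
    dr = 1.0 / ((1.0 + z_i) ** 2 * math.sqrt(omega * z_i + 1.0))
    for _ in range(n_steps):
        k1r, k1d = rhs(zz, r, dr)
        k2r, k2d = rhs(zz + 0.5 * h, r + 0.5 * h * k1r, dr + 0.5 * h * k1d)
        k3r, k3d = rhs(zz + 0.5 * h, r + 0.5 * h * k2r, dr + 0.5 * h * k2d)
        k4r, k4d = rhs(zz + h, r + h * k3r, dr + h * k3d)
        r += h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
        dr += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        zz += h
    return r


if __name__ == "__main__":
    print(dyer_roeder_r(0.0, 2.0, omega=1.0, alpha_tilde=0.0))
    print(dyer_roeder_r(0.0, 2.0, omega=1.0, alpha_tilde=1.0))
```

Integrating towards $z < z_i$ gives a negative $r$, which is why (4.29) takes the absolute value. Comparing `abs(dyer_roeder_r(z2, z1, ...)) / (1 + z2)` with `dyer_roeder_r(z1, z2, ...) / (1 + z1)` gives a numerical check of the reciprocity between the two directions.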

The general solution of this initial value problem is provided in Seitz & Schneider (1994).
If there is no inhomogeneity in the beam between its vertex and redshift $z$, the Jacobi
matrix $\mathcal{D}(z)$ is given by $\mathcal{D}(z) = \frac{c}{H_0}r(0,z)\,\mathcal{I}$; in particular, at the first inhomogeneity at
$z_1$, $\mathcal{D}(z_1) = D(0,z_1)\,\mathcal{I}$. To describe the solution of eq. (2.4a) between the $(n-1)$-th and
$n$-th and between the $n$-th and $(n+1)$-th inhomogeneity, we put:

$$\mathcal{D}(z) = X_1B_1(z) + Y B_2(z) \ , \qquad z \in [z_{n-1}, z_n] \ , \qquad (4.30)$$

$$\mathcal{D}(z) = X_2B_1(z) + Z B_2(z) \ , \qquad z \in [z_n, z_{n+1}] \ . \qquad (4.31)$$

Here, $B_1$ and $B_2$ are linearly independent solutions of the Dyer-Roeder differential
equation; we choose them as $B_1(z) := D(0,z) =: D(z)$ and $B_2(z) = D(z_n,z)$. $X_1$, $X_2$, $Y$ and
$Z$ are real $2\times2$ matrices, determined by the boundary conditions. Evaluating (4.30) and
(4.31) at $z_n$ immediately yields $X := X_1 = X_2 = \frac{1}{D_n}\mathcal{D}_n =: A_n$. Then, we calculate the
derivatives of (4.30) and (4.31) with respect to $\lambda$, evaluate these at $\lambda_n$ and obtain with
(4.3) and (4.28) the difference:

$$\dot{\mathcal{D}}_n^+ - \dot{\mathcal{D}}_n^- = Z\,\frac{d}{d\lambda}D(z_n,z)\Big|_{z\searrow z_n} - Y\,\frac{d}{d\lambda}D(z_n,z)\Big|_{z\nearrow z_n} = (1+z_n)\,[Z+Y] \ . \qquad (4.32)$$

The matrices $Y$ and $Z$ can be calculated by evaluating (4.30) and (4.31) at $z_{n-1}$ and
$z_{n+1}$, respectively. With the abbreviations $D(z_i,z_j) =: D_{ij}$ and $D(z_i) =: D_i$ this yields:

$$Y = \frac{1}{D_{n,n-1}}\left(\mathcal{D}_{n-1} - \frac{D_{n-1}}{D_n}\,\mathcal{D}_n\right) \ , \qquad Z = \frac{1}{D_{n,n+1}}\left(\mathcal{D}_{n+1} - \frac{D_{n+1}}{D_n}\,\mathcal{D}_n\right) \ . \qquad (4.33)$$

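We insert (4.33) and (4.32) into (4.26), use the Etherington (1933) reciprocity relation

$$\frac{D(z_1,z)|_{z=z_2}}{1+z_1} = \frac{D(z_2,z)|_{z=z_1}}{1+z_2}$$

and obtain for $\mathcal{D}_{n+1}$:

$$\mathcal{D}_{n+1} = -D_{n,n+1}\,\mathbf{U}_n\,\mathcal{D}_n - \frac{(1+z_{n-1})\,D_{n,n+1}}{(1+z_n)\,D_{n-1,n}}\,\mathcal{D}_{n-1} + \left[\frac{D_{n+1}}{D_n} + \frac{(1+z_{n-1})\,D_{n,n+1}\,D_{n-1}}{(1+z_n)\,D_{n-1,n}\,D_n}\right]\mathcal{D}_n \ . \qquad (4.34)$$

This relation is equivalent to the recurrence relation for the Jacobi matrices in
lens theory. This becomes clear if one rewrites this equation, as is common in lens theory,
in dimensionless form. One has to insert the dimensionless deflection matrix $U(x)$ related
to $\mathbf{U}(\xi)$ via

$$U_j(x_j) = \frac{D_j\,D_{j,N+1}}{D_{N+1}}\,\mathbf{U}_j(\xi_j) \ , \qquad \xi_j := x_j D_j \ ,$$

and the definitions of the dimensionless Jacobi matrices $A_i(x_i) := \frac{1}{D_i}\mathcal{D}_i(D_ix_i)$. Defining
the geometrical quantities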



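$$v_i^2 := \frac{(1+z_{i-1})}{(1+z_i)}\,\frac{D_{i-1}\,D_{i,i+1}}{D_{i-1,i}\,D_{i+1}} \ , \qquad 1 \le i \le N$$

(with $z_0 = 0$ and $D_0 = 0$, so that $v_1 = 0$), and

$$\beta_{ij} := \frac{D_{ij}\,D_{N+1}}{D_j\,D_{i,N+1}} \ , \qquad 1 \le i < j \le N+1 \ ,$$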

as in Paper I, this yields

$$A_{n+1} = -\beta_{n,n+1}U_nA_n - v_n^2A_{n-1} + (1+v_n^2)A_n = T_nA_n - v_n^2A_{n-1} \ , \qquad (4.35)$$

where the $2\times2$ matrices $T_n$ are defined as $T_n := (1+v_n^2)\,\mathcal{I} - \beta_{n,n+1}U_n$, $1 \le n \le N$,
and the starting condition is $A_1 = \mathcal{I}$. This is the same recurrence relation as that in
gravitational lens theory, see e.g. eq. (2.21) of Paper I. Hence we have shown that the
recurrence relation for the mapping of the Jacobi matrices in lens theory can be derived
as a direct approximation from geometrical optics.
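
The recurrence (4.35) is easy to iterate once the distances are known. The following is a hedged sketch rather than a reference implementation. It assumes an Einstein-de Sitter background with an empty beam ($\Omega = 1$, no smoothly distributed matter), for which the Dyer-Roeder equation has the closed-form solution used in `dist`. It also assumes that $v_n^2$ is the coefficient $(1+z_{n-1})D_{n-1}D_{n,n+1}/[(1+z_n)D_{n-1,n}D_{n+1}]$ obtained when the dimensional recurrence is divided by $D_{n+1}$. All function names are illustrative.

```python
def dist(z_i, z_j):
    """Empty-cone distance D(z_i, z_j) in units of c/H0 for Omega = 1, empty beam.

    Closed-form solution of the Dyer-Roeder equation with the data (4.28).
    """
    a_i, a_j = (1.0 + z_i) ** -2.5, (1.0 + z_j) ** -2.5
    return 0.4 * (1.0 + z_i) * abs(a_i - a_j)


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def jacobi_recurrence(z_planes, z_source, u_matrices):
    """Iterate eq. (4.35); returns [A_1, ..., A_{N+1}].

    z_planes: lens redshifts z_1 < ... < z_N; u_matrices: dimensionless U_n.
    """
    z = [0.0] + list(z_planes) + [z_source]  # z[0]: observer, z[N+1]: source
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # a[0] is a placeholder; it is only ever multiplied by v_1^2 = 0.
    a = [[[0.0, 0.0], [0.0, 0.0]], identity]
    d_s = dist(0.0, z_source)
    for n in range(1, len(z_planes) + 1):
        d_next = dist(0.0, z[n + 1])
        v2 = ((1.0 + z[n - 1]) * dist(0.0, z[n - 1]) * dist(z[n], z[n + 1])
              / ((1.0 + z[n]) * dist(z[n - 1], z[n]) * d_next))
        beta = dist(z[n], z[n + 1]) * d_s / (d_next * dist(z[n], z_source))
        u = u_matrices[n - 1]
        t = [[(1.0 + v2) * identity[i][j] - beta * u[i][j] for j in range(2)]
             for i in range(2)]
        ta = matmul(t, a[n])
        a.append([[ta[i][j] - v2 * a[n - 1][i][j] for j in range(2)]
                  for i in range(2)])
    return a[1:]


if __name__ == "__main__":
    u1 = [[0.3, 0.1], [0.1, -0.2]]
    zero = [[0.0, 0.0], [0.0, 0.0]]
    print(jacobi_recurrence([0.3, 0.8], 2.0, [u1, zero])[-1])
```

Two consistency checks follow directly from the recurrence: with all $U_n = 0$ every $A_n$ stays the unit matrix, and inserting an additional plane with vanishing deflection matrix behind a single lens leaves the final matrix at $\mathcal{I} - U_1$.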

4.4 The deflection angle, the lens equation 



We have seen that light propagation for infinitesimal light beams can be derived from 
geometrical optics. Can one also derive the lens equation and the deflection angle from 
geometrical optics? Yes, provided that the matter outside the clumps is homogeneous 
and the source of shear due to the clumps is assumed to vanish outside of the clumps. 
Therefore, the mapping between two consecutive lens planes can be considered to be 
linear on a large scale, i.e., not just for infinitesimal beams, but also for 'fat beams' 
(which of course have to be smaller than the typical separation between clumps). 

Consider two rays $\gamma_0$ and $\gamma_1$ including an angle at their intersection point at the
observer, where this angle is small enough to ensure that these rays are approximately
parallel, but not necessarily infinitesimally small.

We treat one of them ($\gamma_0$) as a reference ray, adapt a screen to it (as defined in
Sect. 2) and denote the screen position of $\gamma_1$ at redshift $z$ by $\xi^1(z)$. We calculate the
evolution of this separation vector from the observer ($z = 0$) to the source at $z_s = z_{N+1}$
in two steps:

1) Due to the remark above, the separation vector has to satisfy the Jacobi deviation
equation (2.2a) with the source term $\mathcal{T} = \mathcal{R}_{bg}\,\mathcal{I}$ outside inhomogeneities. Hence, each
component of this separation vector has to satisfy the Dyer-Roeder differential equation
(4.27). Thus, if we indicate the screen position of $\gamma_1$ (relative to $\gamma_0$) at the $j$-th
inhomogeneity by $\xi_j^1$, we can describe this separation vector between the $(n-1)$-th and the $n$-th
lens plane by

$$\xi^1(z) = \frac{r(z_n,z)}{r(z_n,z_{n-1})}\,\xi_{n-1}^1 + \frac{r(z_{n-1},z)}{r(z_{n-1},z_n)}\,\xi_n^1 \ , \qquad z_{n-1} \le z \le z_n \ . \qquad (4.36)$$

Note that $r(z_n,z)$ and $r(z_{n-1},z)$ form a pair of linearly independent solutions of the
Dyer-Roeder equation and that inserting $z_n$ and $z_{n-1}$ yields the correct boundary
conditions.

2) If there was no inhomogeneity at redshift $z_n$, (4.36) would stay valid also for $z > z_n$.
But since there is an inhomogeneity, we have to correct for this and we have to take
into account that for $z > z_n$, the optical tidal matrix again becomes $\mathcal{T} = \mathcal{R}_{bg}\,\mathcal{I}$. The
correction function has to be a solution $B(z)$ of the Dyer-Roeder equation. Thus we
obtain

$$\xi^1(z) = \frac{r(z_n,z)}{r(z_n,z_{n-1})}\,\xi_{n-1}^1 + \frac{r(z_{n-1},z)}{r(z_{n-1},z_n)}\,\xi_n^1 - B(z)\,c_n(\xi_n^1) \ , \qquad z_n \le z \le z_{n+1} \ . \qquad (4.37)$$

$c_n$ is a non-zero vector quantity, therefore $B$ must vanish at $z_n$. We can choose the
derivative of $B$ at $z_n$ such that

$$\frac{dB}{d\lambda}\Big|_{\lambda=\lambda_n} = (1+z_n) \qquad (4.38)$$

holds, and thus $B(z) = D(z_n,z)$.

The deflection angle

We define the derivatives of the separation vector of the two rays with respect to the
affine parameter, before and after the $n$-th inhomogeneity:

$$\dot\xi_{n+}^1 := \lim_{\Delta\lambda\searrow0}\frac{d\xi^1}{d\lambda}(\lambda_n+\Delta\lambda) \ , \qquad \dot\xi_{n-}^1 := \lim_{\Delta\lambda\searrow0}\frac{d\xi^1}{d\lambda}(\lambda_n-\Delta\lambda) \ . \qquad (4.39)$$

Since $dD_{\rm proper} = (1+z_n)\,d\lambda$ for an observer at $z_n$,

$$e_{\rm out} = (1+z_n)^{-1}\,\dot\xi_{n+}^1 \quad\text{and}\quad e_{\rm in} = (1+z_n)^{-1}\,\dot\xi_{n-}^1 \qquad (4.40)$$

are the angular directions of $\gamma_1$ relative to $\gamma_0$ before ($e_{\rm in}$) and after traversing the
inhomogeneity ($e_{\rm out}$), respectively. We use (4.36), (4.37) and (4.38) to obtain

$$\dot\xi_{n+}^1 - \dot\xi_{n-}^1 = -\lim_{\Delta\lambda\searrow0}\frac{d}{d\lambda}B(\lambda_n+\Delta\lambda)\,c_n(\xi_n^1) = -(1+z_n)\,c_n(\xi_n^1) \ ; \qquad (4.41)$$

with (4.40) this becomes

$$(e_{\rm out} - e_{\rm in}) = -c_n(\xi_n^1) \ . \qquad (4.42)$$

Hence, $c_n(\xi_n^1)$ is the difference of the deflection angles at the screen position $\xi_n^1$ and the
reference ray position ($\xi_n = 0$). We now calculate the value of the vector $c_n(\xi_n^1)$ as a
function of the surface mass density $\Sigma$ of the inhomogeneity and show that it is equal to
the difference of the deflection angles $\hat{\boldsymbol\alpha}_n(\xi_n^1) - \hat{\boldsymbol\alpha}_n(0)$ used in lens theory.

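Consider a family of rays forming an infinitesimal beam with central ray $\gamma_1$; we
denote their screen vectors in the $n$-th lens plane by $\xi_n = \xi_n^1 + \Delta\xi_n$ and their angular
positions relative to $\gamma_1$ at the observer by $\Delta\Theta$. Discussing the Jacobian map of this
infinitesimal beam, $\mathcal{D}_n(\xi_n^1) = \frac{\partial\xi_n}{\partial\Delta\Theta}$, and its derivatives $\dot{\mathcal{D}}_n^{\pm}(\xi_n^1) = \frac{\partial\dot\xi_{n\pm}}{\partial\Delta\Theta}$
at the inhomogeneity, we obtain with (4.41) for the difference of these matrices

$$\dot{\mathcal{D}}_n^+(\xi_n^1) - \dot{\mathcal{D}}_n^-(\xi_n^1) = -(1+z_n)\left(\frac{\partial c_n(\xi_n)}{\partial\Delta\Theta}\right)_{\xi_n^1} = -(1+z_n)\left(\frac{\partial c_n}{\partial\xi_n}\right)_{\xi_n^1}\left(\frac{\partial\xi_n}{\partial\Delta\Theta}\right) \ .$$

On the other hand, we have from (4.26)

$$\dot{\mathcal{D}}_n^+(\xi_n^1) - \dot{\mathcal{D}}_n^-(\xi_n^1) = -(1+z_n)\,\mathbf{U}_n(\xi_n^1)\,\mathcal{D}_n(\xi_n^1) \ . \qquad (4.43)$$

This implies

$$\left(\frac{\partial c_n}{\partial\xi_n}\right)_{\xi_n^1} = \mathbf{U}_n(\xi_n^1) \qquad (4.44)$$

for every $\xi_n^1$ and therefore,

$$c_n(\xi_n) = \nabla_{\xi_n}\Psi_n(\xi_n) + {\rm const} \ . \qquad (4.45)$$

The additive constant has to be $-\nabla_{\xi_n}\Psi_n(0)$; this can be obtained from the limit $\xi_n^1 \to 0$,
i.e., the case where the ray considered coincides with the reference ray: for this ray
$\xi^1(z) = 0$. Therefore, we finally obtain with (4.20)

$$c_n(\xi_n^1) = \frac{4G}{c^2}\int_{\mathbb{R}^2}\Sigma(\xi')\left[\frac{\xi_n^1-\xi'}{|\xi_n^1-\xi'|^2} - \frac{0-\xi'}{|0-\xi'|^2}\right]d^2\xi' = \hat{\boldsymbol\alpha}_n(\xi_n^1) - \hat{\boldsymbol\alpha}_n(0) \ ; \qquad (4.46)$$

as claimed before, this is the difference of the deflection angles of the ray considered and
the reference ray.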
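
For a single point mass, $\Sigma(\xi') = M\,\delta^2(\xi')$, the deflection-angle integral reduces to $\hat{\boldsymbol\alpha}(\xi) = (4GM/c^2)\,\xi/|\xi|^2$, which is the gradient of $\Psi = (4GM/c^2)\ln(|\xi|/D_d)$. The sketch below illustrates this special case only; the rounded physical constants, the solar values and the function names are assumptions made for the example.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (rounded)
C = 2.998e8    # speed of light in m s^-1 (rounded)


def deflection_potential(mass_kg, xi, d_d):
    """Psi(xi) = (4GM/c^2) ln(|xi|/D_d) for a point mass; lengths in metres."""
    return 4.0 * G * mass_kg / C ** 2 * math.log(math.hypot(xi[0], xi[1]) / d_d)


def point_mass_deflection(mass_kg, xi):
    """alpha_hat(xi) = (4GM/c^2) xi/|xi|^2, returned as a 2-tuple in radians."""
    b2 = xi[0] ** 2 + xi[1] ** 2
    pref = 4.0 * G * mass_kg / (C * C * b2)
    return (pref * xi[0], pref * xi[1])


if __name__ == "__main__":
    m_sun, r_sun = 1.989e30, 6.957e8  # solar mass (kg) and radius (m), rounded
    alpha = point_mass_deflection(m_sun, (r_sun, 0.0))
    print(math.hypot(*alpha) * 180.0 / math.pi * 3600.0, "arcsec at the solar limb")
```

For a ray grazing the solar limb this gives the familiar value of about 1.75 arcseconds. Because only the gradient of $\Psi$ enters, the result does not depend on the length $D_d$ in the logarithm.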

The lens equation

Evaluating (4.37) at redshift $z_{n+1}$, inserting $B(z_{n+1}) = D(z_n, z_{n+1})$, (4.46), (4.29) with
$r(z_n,z_{n-1}) = -|r(z_n,z_{n-1})|$, Etherington's reciprocity relation, and dropping the indices
$1$ yields:

$$\xi_{n+1} = -\frac{(1+z_{n-1})\,D_{n,n+1}}{(1+z_n)\,D_{n-1,n}}\,\xi_{n-1} + \frac{D_{n-1,n+1}}{D_{n-1,n}}\,\xi_n - D_{n,n+1}\left[\hat{\boldsymbol\alpha}_n(\xi_n) - \hat{\boldsymbol\alpha}_n(0)\right] \ . \qquad (4.47)$$

Using the quantities $v_n$ and $\beta_{n,n+1}$ and the dimensionless impact vectors $x_j = \xi_j/D_j$
shows that the first term on the rhs of (4.47) can be rewritten as

$$-\frac{(1+z_{n-1})\,D_{n,n+1}}{(1+z_n)\,D_{n-1,n}}\,\xi_{n-1} = -v_n^2\,D_{n+1}\,x_{n-1} \ ; \qquad (4.48a)$$

for the second one, using the equations (C2) and (C5) of Paper I, we obtain:

$$\frac{D_{n-1,n+1}}{D_{n-1,n}}\,\xi_n = D_{n+1}\,(1+v_n^2)\,x_n \ . \qquad (4.48b)$$

With the definition of the scaled deflection angle $\boldsymbol\alpha_n := \frac{D_{n,N+1}}{D_{N+1}}\,\hat{\boldsymbol\alpha}_n$, we find

$$D_{n,n+1}\left[\hat{\boldsymbol\alpha}_n(\xi_n) - \hat{\boldsymbol\alpha}_n(0)\right] = D_{n+1}\,\beta_{n,n+1}\left[\boldsymbol\alpha_n(x_n) - \boldsymbol\alpha_n(0)\right] \ ; \qquad (4.48c)$$

inserting the equations (4.48) in (4.47) yields the dimensionless recurrence relation for
the impact vectors $x_j$ in the lens planes

$$x_{n+1} = (1+v_n^2)\,x_n - v_n^2\,x_{n-1} - \beta_{n,n+1}\left[\boldsymbol\alpha_n(x_n) - \boldsymbol\alpha_n(0)\right] \ , \qquad 1 \le n \le N \ . \qquad (4.49)$$

We transform the center of the coordinate system in each lens plane such that

$$x_j' = x_j - \sum_{i=1}^{j-1}\beta_{ij}\,\boldsymbol\alpha_i(0) \ , \qquad (4.50a)$$

define

$$\boldsymbol\alpha_i'(x_i') := \boldsymbol\alpha_i(x_i) \qquad (4.50b)$$

and obtain with (C8) of Paper I and the comment below this equation in Paper I, the
recurrence relation one uses in lens theory [see Paper I, equation (2.19)]:

$$x_{n+1}' = (1+v_n^2)\,x_n' - v_n^2\,x_{n-1}' - \beta_{n,n+1}\,\boldsymbol\alpha_n'(x_n') \ , \qquad 1 \le n \le N \ . \qquad (4.51)$$

Whereas (4.49) describes the mapping of a ray relative to a reference ray, which is also
deflected at every inhomogeneity, (4.51) describes the mapping of a ray relative to the
'optical axis'. This optical axis can be constructed by piecewise smooth null geodesics
(of the empty cone metric) connecting the (new) centers of the coordinate systems on
consecutive lens planes with each other; thus this optical axis represents a kinematically
possible ray, but not necessarily an actual light ray (see Fermat's principle in SEF, e.g.,
Chapt. 9.2). It has been shown already in SEF that the formulation (4.51) of the multiple
lens plane equation is equivalent to the more familiar one (now we drop the primes),

$$x_j = x_1 - \sum_{i=1}^{j-1}\beta_{ij}\,\boldsymbol\alpha_i(x_i) \ , \qquad 1 \le j \le N+1 \ ;$$

for the special case $j = N+1$, we obtain with $\beta_{i,N+1} = 1$ for the source position that

$$x_{N+1} = x_1 - \sum_{i=1}^{N}\boldsymbol\alpha_i(x_i) \ .$$

Therefore, we have shown in this chapter that the equations describing the mapping of a 
light ray and that of a light beam in gravitational lens theory can be derived with a series 
of well defined approximations from the description of light propagation in geometrical 
optics. In essence, the multiple deflection gravitational lens equation can be viewed as 
a discretization of the exact propagation equation (2.4), applied to the case of weak 
gravitational fields (but not necessarily weak matter inhomogeneities) . 
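
The equivalence of the three-term recurrence (4.51) and the more familiar sum form of the multiple lens plane equation can be illustrated numerically. The following sketch rests on several assumptions that are not taken from the text: an Einstein-de Sitter background with an empty beam, whose closed-form Dyer-Roeder distances are used in `dist`; the explicit coefficient $v_n^2 = (1+z_{n-1})D_{n-1}D_{n,n+1}/[(1+z_n)D_{n-1,n}D_{n+1}]$; and a purely hypothetical softened deflection law for each plane. Any deflection law could be substituted, since the equivalence depends only on the distance factors.

```python
def dist(z_i, z_j):
    """Empty-cone distance D(z_i, z_j) in units of c/H0 for Omega = 1, empty beam."""
    a_i, a_j = (1.0 + z_i) ** -2.5, (1.0 + z_j) ** -2.5
    return 0.4 * (1.0 + z_i) * abs(a_i - a_j)


def beta(z, i, j):
    """beta_ij = D_ij D_s / (D_j D_is); z[0] is the observer, z[-1] the source."""
    return dist(z[i], z[j]) * dist(0.0, z[-1]) / (dist(0.0, z[j]) * dist(z[i], z[-1]))


def softened_lens(strength, core):
    """Hypothetical scaled deflection law alpha(x) = strength * x / (|x|^2 + core^2)."""
    def alpha(x):
        f = strength / (x[0] ** 2 + x[1] ** 2 + core ** 2)
        return (f * x[0], f * x[1])
    return alpha


def trace_sum(z, alphas, x1):
    """x_j = x_1 - sum_{i<j} beta_ij alpha_i(x_i) for j = 2, ..., N+1."""
    xs = [None, x1]  # xs[j] is the impact vector in plane j
    for j in range(2, len(z)):
        x = list(x1)
        for i in range(1, j):
            a = alphas[i - 1](xs[i])
            b = beta(z, i, j)
            x[0] -= b * a[0]
            x[1] -= b * a[1]
        xs.append(tuple(x))
    return xs[1:]


def trace_recurrence(z, alphas, x1):
    """Three-term recurrence (4.51) with starting value x_1."""
    xs = [(0.0, 0.0), x1]  # xs[0] is a placeholder, weighted by v_1^2 = 0
    for n in range(1, len(z) - 1):
        v2 = ((1.0 + z[n - 1]) * dist(0.0, z[n - 1]) * dist(z[n], z[n + 1])
              / ((1.0 + z[n]) * dist(z[n - 1], z[n]) * dist(0.0, z[n + 1])))
        a = alphas[n - 1](xs[n])
        b = beta(z, n, n + 1)
        xs.append(tuple((1.0 + v2) * xs[n][k] - v2 * xs[n - 1][k] - b * a[k]
                        for k in range(2)))
    return xs[1:]


if __name__ == "__main__":
    z = [0.0, 0.3, 0.7, 1.2, 2.0]  # observer, three lens planes, source
    lenses = [softened_lens(0.05, 0.1), softened_lens(0.08, 0.2), softened_lens(0.03, 0.05)]
    print(trace_sum(z, lenses, (0.4, -0.25))[-1])
    print(trace_recurrence(z, lenses, (0.4, -0.25))[-1])
```

Both routines return the impact vectors $x_1, \ldots, x_{N+1}$; the last entry is the source position, and the two printed results should agree to rounding error.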

4.5 Remark on Fermat's principle 

In SEF, Sect. 4.6, the derivation of the lens equation was based on a relativistic version
of Fermat's principle. The argument leading to the geometric contribution to the time
delay, eq. (4.65), p. 145 in SEF, suffers from an apparent inconsistency. On p. 143, it
is first stated that light rays from the source to the neighborhood of the deflector and,
after deflection, those from that neighborhood to the observer, form 'shearfree beams
... subject only to the focussing of the smooth part of matter', i.e. to $\tilde\alpha\rho_F$; but the
subsequent calculations are said to be based on the large-scale RW metric which is
related to the average density, $\rho_F$. This, however, presents an apparent difficulty only.
In the 'empty' region, outside clumps, the shear of light beams is assumed negligible.
Now, it is known that the only conformally flat non-static dust spacetimes are
Friedmann ones (see Kramer et al. 1980, Sects. 22.2, 32.42, 32.5). Therefore, it seems
reasonable to approximate the universe in 'empty' cone regions by a Friedmann model
whose mean motion equals that of the large-scale background model, but whose density
is $\tilde\alpha\rho_F$. This implies that the metric, $d\tilde s^2$, is related to the large-scale metric, $ds^2$, by a
constant conformal factor,

$$d\tilde s^2 = \tilde\alpha^{-1}\,ds^2 = \tilde\alpha^{-1}R^2(\eta)\left\{d\eta^2 - d\sigma_k^2\right\} \ .$$

Therefore, the spatial paths of light rays in empty regions are the same for $d\tilde s^2$ as for
$ds^2$, viz. geodesics of $d\sigma_k^2$, and the reasoning on p. 144/145 leading to eq. (4.65) applies
without change, since that equation is invariant under a constant rescaling of the RW
metric. (Angles and the redshift $z_d$ remain unchanged, and the distances $c\,\Delta t_{\rm geom}$, $D_d$,
$D_s$, $D_{ds}$ are rescaled by the same factor.)

5 The magnification of the flux of light beams

5.1 The flux of a radiation field, magnification factor

The monochromatic flux $S_\omega$ of a radiation field, measured by an observer at frequency
$\omega$, is given by the product of its specific intensity $I_\omega$ and the solid angle $d\Omega$ the source
subtends on the observer's sky: $S_\omega = I_\omega\,d\Omega$. The specific intensity at the observer is
related to that at the source by the conservation of the phase space density of photons.
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This implies, according to SEF, Sect. 3.6, that for any non-interacting radiation field the 
scalar jjg is observer-independent, i.e., independent of his four velocity, and constant on 
a light beam: 

a; 3 (A) w s 3 a; 3 ' 1 j 

where A is the affine parameter of the central light ray of the beam, to(0) =: u>o and 
uj(X s ) =: m s . 

Consider an infinitesimal monochromatic source radiating with frequency $\omega_s$, and
observed with frequency $\omega$; its observed flux depends on the source of shear and
convergence along the beam connecting source and observer. If these source
terms are changed such that the frequency at the observer and the affine parameter-redshift relation
stay the same, then, for the same observer, the observed flux of the source changes
according to (5.1) to $S_\omega = S^0_\omega\, d\Omega/d\Omega^0$, with $S^0_\omega$ being the flux before changing the source
terms. In an on-average Friedmann universe, the frequency of the light is not changed
by the deflection and, by definition, the affine parameter-redshift relation is not affected
by the clumps. Hence, we can compare the flux $S_\omega$ of the source with the case where
there are no intervening clumps between source and observer, and obtain for the ratio

$$ \mu := \frac{S_\omega}{S^0_\omega} = \frac{d\Omega}{d\Omega^0} . \tag{5.2} $$

$\mu$ is the so-called magnification factor; if $\mu > 1$, the light beam is called magnified relative
to the empty beam. $d\Omega$ and $d\Omega^0$ are the solid angles which the source subtends on the
sky for the cases with and without clumps in the beam. If we use (2.7), we obtain that
the magnification $\mu(\lambda)$ of a source at the affine parameter $\lambda$, compared to the case where
the source is observed through the empty beam, can be described as



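$$ \mu(\lambda) = \left| \frac{\det \mathcal{D}^0(\lambda)}{\det \mathcal{D}(\lambda)} \right| = \left| \frac{D^2(\lambda)}{\det \mathcal{D}(\lambda)} \right| = \frac{1}{\left| \det A(\lambda) \right|} . \tag{5.3} $$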



For the second equality we have used that for the empty beam, the Jacobi matrix is
given by $\mathcal{D}^0(\lambda) = D(\lambda)\,\mathcal{I}$, with $D(\lambda)$ being the angular diameter distance of the empty
beam, i.e., the solution of the Dyer-Roeder equation (4.27) with boundary conditions
(4.28). The third equality follows from the definition of the dimensionless Jacobian
matrix $A(\lambda) = \mathcal{D}(\lambda)/D(\lambda)$. Hence, the discussion of the matrix $A$ or the magnification
factor $\mu$ in gravitational lens theory always implies the discussion of light propagation
relative to the empty beam case. This point of view is reasonable:

1) As long as there are only a few clumps, i.e., if $1 - \alpha$ is small, most light beams are
empty cone beams. Therefore, the magnification factor in (5.3) describes the observed
flux of a source whose beam is distorted between source and observer, relative to the
most typical case, where the beam is not distorted.

2) The other extreme is the case where $1 - \alpha$ becomes approximately one: the source
of convergence becomes extremely small, and for the description of the very few light
beams that do not traverse a matter inhomogeneity, one cannot neglect the source of
shear, which is different along every individual beam. Hence, a typical light beam no
longer exists, and the definition of the magnification factor as in (5.2) and (5.3)
has no illustrative meaning: it compares the flux of the considered light beam with that
of a fictitious beam.

3) As mentioned at the beginning of Sect. 4, for a weakly inhomogeneous universe (e.g.,
if one considers spatial scales on which the matter inhomogeneities are (quasi-)linear),
the magnification is defined relative to that of the smooth Friedmann-Lemaître universe.
In this case, the angular diameter distances $D(\lambda)$ are those obtained from (4.27) with
$\alpha = 1$, and $\rho_{cl} = \delta\rho$ (the density fluctuations) can have either sign; therefore, $\mathcal{R}_{cl}$ is no
longer non-positive.
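
The chain of equalities in (5.3) can be made concrete with a small numerical sketch. The Python fragment below is illustrative only: the function name `magnification`, the example matrix and the value of the empty-beam distance are assumptions, and the absolute value is taken so that $\mu$ is positive irrespective of the sign of the determinant.

```python
import numpy as np

def magnification(jacobi, d_empty):
    """Magnification relative to the empty beam, mu = D^2 / |det Dmat| = 1 / |det A|.

    jacobi  : 2x2 Jacobi matrix of the actual beam (dimension of a length)
    d_empty : angular diameter distance D of the empty beam (same unit)
    """
    # Dimensionless Jacobian matrix A = Dmat / D.
    a_matrix = np.asarray(jacobi, dtype=float) / d_empty
    return 1.0 / abs(np.linalg.det(a_matrix))

# Hypothetical numbers: a beam compressed to 0.8 and 0.5 of the empty-beam size
# along its two principal axes.
d_empty = 3.0
mu_distorted = magnification(d_empty * np.array([[0.8, 0.0], [0.0, 0.5]]), d_empty)
mu_empty = magnification(d_empty * np.eye(2), d_empty)
```

For the empty beam itself the Jacobi matrix is $D\,\mathcal{I}$, so the sketch returns a magnification of one, in line with the interpretation of $\mu$ as a flux ratio relative to the empty beam.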



5.2 The relative focusing equation 

The focusing equation (2.8) describes the evolution of the angular diameter distance of 
a light beam due to the Ricci-focusing and the shear rate of the beam. In the case 
of an on-average Friedmann universe, all light beams have the empty cone background 
density as a common contribution to their focusing, and different additional source terms 
due to the clumps. Therefore, we want to derive a differential equation which describes 
the evolution of the beam relative to the empty beam; the source terms of this relative 
focusing equation are then produced by the clumps only. 
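Consider the differential equation

$$ \frac{d^2}{d\lambda^2} w(\lambda) = \left[ h(\lambda) + c(\lambda) \right] w(\lambda) , \tag{5.4} $$

and let $w(\lambda)$ be the (unknown) solution of (5.4) with boundary conditions $w(0) = 0$ and
$\frac{dw}{d\lambda}(0) = 1$. Assume that $v(\lambda)$ is the well-known solution of (5.4) for the case $c(\lambda) = 0$,

$$ \frac{d^2}{d\lambda^2} v(\lambda) = h(\lambda)\, v(\lambda) , \tag{5.5} $$

with the same boundary conditions: $v(0) = 0$, $\frac{dv}{d\lambda}(0) = 1$. We define a strictly monotonically
decreasing function $X(\lambda)$ by

$$ X(\lambda) = \int_\lambda^{\lambda_{\max}} \frac{d\lambda'}{v^2(\lambda')} , \tag{5.6} $$

so that $X(\lambda_{\max}) = 0$; the value of $\lambda_{\max}$ will be specified below. Then, inserting the
equations (5.5) and (5.6) in (5.4), we obtain for the ratio $a := w/v$ the differential equation:

$$ \frac{d^2}{dX^2} a(X) = v^4(X)\, c(X)\, a(X) ; \tag{5.7} $$

Using $w(\lambda)|_{\lambda=0} = 0 = v(\lambda)|_{\lambda=0}$, the boundary conditions for $a$ become, as a function of
$\lambda$,

$$ a(\lambda)\big|_{\lambda=0} = 1 , \qquad \frac{d}{dX} a(\lambda)\Big|_{\lambda=0} = 0 . \tag{5.8} $$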
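
The step from (5.4) to (5.7) can be written out explicitly. The following is a sketch of the intermediate algebra, assuming only that $X$ satisfies $dX/d\lambda = -1/v^2$, as in (5.6), and writing primes for derivatives with respect to $\lambda$ (a notation introduced here for brevity):

```latex
\begin{align*}
  a &= \frac{w}{v} , \qquad
  \frac{d}{dX} = \frac{d\lambda}{dX}\,\frac{d}{d\lambda} = -v^2\,\frac{d}{d\lambda} , \\
  \frac{da}{dX} &= -v^2\,\frac{w' v - w v'}{v^2} = -\left( w' v - w v' \right) , \\
  \frac{d^2 a}{dX^2} &= -v^2\,\frac{d}{d\lambda}\left[ -\left( w' v - w v' \right) \right]
                      = v^2 \left( w'' v - w v'' \right) \\
                     &= v^2 \left[ (h + c)\, w\, v - w\, h\, v \right]
                      = v^3\, c\, w = v^4\, c\, a .
\end{align*}
```

The contribution of $h$ cancels in the last line, which is why only $c$ survives as source term. Since $w = v = 0$ at $\lambda = 0$, the second line also shows that $da/dX$ vanishes there, as stated in (5.8).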

We interpret^8 (5.4) by inserting $h = \mathcal{R}_{bg}(\lambda)$ and $c = \mathcal{R}_{cl}(\lambda) - |\sigma(\lambda)|^2$; then, $w$ and $v$
denote the angular diameter distances of the 'actual' beam considered and that in an
empty cone, respectively. Therefore, (5.7) describes how the considered light beam is
focused relative to the empty beam and is therefore called relative focusing equation; the
solution of (5.7) can be described, with

$$ v(\lambda) = D(\lambda) = \frac{c}{H_0}\, r(z(\lambda)) , $$

as

$$ a(\lambda) = \frac{1}{D(\lambda)} \sqrt{\det \mathcal{D}}\,(\lambda) = \sqrt{\det A(\lambda)} . \tag{5.9} $$

The inverse of (5.9) yields the magnification of the beam at a position $\lambda$: $|a(\lambda)|^2 = \mu^{-1}(\lambda)$.

^8 One can calculate the relative magnification of two light beams with (5.7) even in the case of a
non-Friedmann universe, if the affine parameters of these light beams are the same (e.g. as a function
of redshift).

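We can identify $\frac{c}{H_0} X$ with the cosmological $\chi$-function, defined in equation (4.68) of SEF,
since

$$ \frac{dX}{d\lambda} = -\left( \frac{c}{H_0} \right)^{-2} \frac{1}{r^2(\lambda)} $$

yields, if we put $\lambda_{\max} = \lim_{z\to\infty} \lambda(z)$ and use equation (4.3),

$$ X(\lambda) = \left( \frac{H_0}{c} \right)^{2} \int_\lambda^{\lambda_{\max}} \frac{d\lambda'}{r^2(\lambda')} = \frac{H_0}{c}\, \chi(z(\lambda)) . \tag{5.10} $$

Inserting (5.10) and (5.9) in (5.7), the relative focusing equation can be rewritten as

$$ \frac{d^2}{d\chi^2} a(\chi) = \left( \frac{c}{H_0} \right)^{2} r^4(\chi) \left[ \mathcal{R}_{cl}(\chi) - |\sigma(\chi)|^2 \right] a(\chi) , \qquad a(\chi) = \sqrt{\det A}\,(\chi) . \tag{5.11} $$

Note that, due to the strictly monotonic behaviour of $\chi$ and $\lambda$ as functions of $z$, we can
consider any variable on a light ray as a function of $z$, $\lambda$ or $\chi$.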

5.3 The focusing theorem 

The non-positiveness of the source term $\mathcal{R}_{cl} - |\sigma|^2$ due to the clumps in the focusing
equation shows that a beam propagating through clumps is always more focused than the
empty beam (in the absence of conjugate points between source and observer). Hence, as
long as the beam has not formed its first conjugate point, the angular diameter distance
cannot be greater than that of an empty comparison beam at the same redshift. This
so-called focusing theorem can be restated with the use of the relative focusing equation:
As long as the light beam has not formed its first conjugate point, the function $a(\lambda)$ is
always between one and zero,

$$ 1 \ge a(\lambda) \ge 0 \quad \Longleftrightarrow \quad \mu(\lambda) \ge 1 , \qquad 0 \le \lambda \le \lambda_c , \tag{5.12} $$

or, the light beam is not demagnified relative to one in an empty cone. This can be
proven immediately: Using the boundary conditions of $a(\lambda)$ described in equation (5.8)
and that, due to the non-positiveness of $\mathcal{R}_{cl} - |\sigma|^2$, the second derivative of $a$ in (5.11)
is always non-positive, one obtains that the value of $a$ is always between one and zero in
the interval between the vertex and the first conjugate point of the beam. One can also
prove a stronger statement: as long as the beam has not passed a conjugate point, the
function $a(\lambda)$ is monotonically decreasing.

Proof: Since $\chi$ tends to plus infinity at the vertex, and $\lim_{\chi\to\infty} \frac{da}{d\chi}(\chi) = 0$, one can write

$$ \frac{da}{d\chi}(\chi) = -\int_\chi^\infty \frac{d^2 a}{d\chi'^2}(\chi')\, d\chi' , $$

and therefore, $\frac{da}{d\lambda}$ can be rewritten with the relative focusing equation as

$$ \frac{da}{d\lambda} = \frac{c}{H_0}\, \frac{1}{r^2(\lambda)} \int_{\chi(\lambda)}^\infty r^4(\chi') \left[ \mathcal{R}_{cl}(\chi') - |\sigma(\chi')|^2 \right] a(\chi')\, d\chi' . $$

Since the integrand is non-positive, $\frac{da}{d\lambda} \le 0$ follows, q.e.d.
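
The statement of the focusing theorem can be checked on a toy example by integrating (5.4) and (5.5) side by side. The Python fragment below is a minimal sketch under stated assumptions: the constant source terms $h = -1$ and $c = -0.5$, the integration range (chosen to end before the first zero of $w$) and the hand-written Runge-Kutta stepper are illustrative choices, not quantities taken from a cosmological model.

```python
import numpy as np

def integrate_beams(h, c, lam_end, n_steps=2000):
    """Integrate w'' = (h + c) w and v'' = h v with w(0) = v(0) = 0, w'(0) = v'(0) = 1.

    Uses a classical fourth-order Runge-Kutta scheme on the state (w, w', v, v').
    Returns the grid in lambda together with w and v on that grid.
    """
    def rhs(lam, y):
        w, dw, v, dv = y
        return np.array([dw, (h(lam) + c(lam)) * w, dv, h(lam) * v])

    lam = np.linspace(0.0, lam_end, n_steps + 1)
    step = lam[1] - lam[0]
    y = np.array([0.0, 1.0, 0.0, 1.0])
    states = [y]
    for l in lam[:-1]:
        k1 = rhs(l, y)
        k2 = rhs(l + step / 2, y + step / 2 * k1)
        k3 = rhs(l + step / 2, y + step / 2 * k2)
        k4 = rhs(l + step, y + step * k3)
        y = y + step / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        states.append(y)
    states = np.array(states)
    return lam, states[:, 0], states[:, 2]

# Toy source terms (assumptions): focusing background, non-positive clump term.
h = lambda lam: -1.0
c = lambda lam: -0.5
lam, w, v = integrate_beams(h, c, lam_end=2.5)

# Ratio a = w / v, skipping lambda = 0 where both distances vanish.
a = w[1:] / v[1:]
```

In this toy case the first zero of $w$ lies at $\pi/\sqrt{1.5} \approx 2.57$, so the whole grid precedes the first conjugate point; the array `a` then stays between zero and one and decreases along the beam, as the theorem requires for any non-positive $c$.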

6 Summary and conclusions 

We have investigated the propagation of infinitesimally small light beams in arbitrary 
spacetimes and derived a Jacobi-type differential equation for the matrix providing the 
linear mapping from the inclination angle of a light ray of the beam to the separation 
vector at arbitrary values of the affine parameter. This matrix carries full information 
about the size, shape, orientation and twist of the beam. We have then concentrated 
on the investigation of the behaviour of light beams near a vertex and near conjugate 
points; in particular, we have derived asymptotic representations of the optical scalars 
near such points. It was pointed out that near a vertex and a focus, the twist of a beam 
is a higher-order contribution to the Jacobi mapping than are expansion and shear. 

We then turned to the special case that the metric is that of a perturbed Friedmann 
universe, i.e., where the overall geometry of the universe is described by a Friedmann 
metric, which however is locally modified to allow for matter inhomogeneities. If the 
matter inhomogeneities are considered to be weak, so that they can be described locally 
by a post-Minkowskian metric, and geometrically-thin and isolated, so that typical light 
beams are propagating most of the time through the background Friedmann metric, the 
influence of the matter inhomogeneities on the light beam can be described by a sum of 
delta-distributional contributions to the source term of the Jacobi equation for the linear 
mapping mentioned above. In this way, we have derived the equations of gravitational 
lens theory, which represents an approximation to the exact propagation equations which 
is particularly useful for, and applies to, most astrophysically relevant situations of light 
propagation in the universe. We want to point out that in contrast to earlier treatments 
of the lens equations (e.g., SEF, Sect. 4.6), we have made no use of the existence of an 
optical axis relative to which the impact vectors are defined; instead, our reference ray 
is a physical, i.e., deflected, light ray. To relate our formulation to the earlier treatment, 
a redefinition of the coordinate frames in the lens planes was performed which yielded 
the lens equation in the standard form. We remind the reader that a derivation of 
the gravitational lens equation can also start from Fermat's principle (see Blandford & 
Narayan 1986, SEF, Sect. 4.6 and references therein); however, the derivation presented 
here appears to be more direct in that one does not make use of geometrical constructions
for the calculation of the 'geometrical time delay', which are less easy to justify in an
'on-average-Friedmann-universe' than the approximations used here. The advantage of
our derivation of the lens equations lies in its explicit listing of approximations which 
have to be made. All but two are not critical and well satisfied in astrophysically relevant 
situations. The two which are as yet not very well understood are: (1) The source of shear 
was assumed to vanish between two consecutive lens planes. (2) It was assumed that the
metric of a clumpy universe can be written locally as a post-Minkowskian modification of
the standard Friedmann metric. Note that a number of investigations have suggested the 
validity of this latter approximation (e.g., Futamase & Sasaki 1989, Jacobs et al. 1993). 
The former assumption certainly has to break down if the universe is highly clumpy, i.e., 
for ail of order unity. However, since it seems that the clumpiness of our universe is
much smaller than unity, we conclude that the (multiple deflection) gravitational lens 
equations provide a useful and fairly accurate approximation in most relevant cases. 

Finally, we have derived an equation for the size of a light beam in a clumpy universe, 
relative to the size of a beam which is unaffected by the matter inhomogeneities. If we 
require that this second-order differential equation contains only the contribution by 
matter clumps as source term, the independent variable is uniquely defined and agrees 
with the $\chi$-function previously introduced [see SEF, eq. (4.68)] for other reasons. This
relative focusing equation immediately yields the result that a light beam cannot be less 
focused than a reference beam which is unaffected by matter inhomogeneities, prior to 
the propagation through its first conjugate point. In other words, no source can appear 
fainter to the observer than in the case that there are no matter inhomogeneities close 
to the line-of-sight to this source, a result previously demonstrated for the case of one 
(Schneider 1984) and several (Paper I, Seitz & Schneider 1994) lens planes.